(57) Abstract: A novel approach to the early detection of colorectal cancer ("CRC") is disclosed, in which a
molecular diagnostic test is used to evaluate grossly normal-appearing colonic tissue. Such grossly
normal-appearing colonic mucosal cells may be collected by non-invasive or minimally invasive procedures. The
use of novel biomarker panels for drug screening also is disclosed. Such biomarker panels may be used wholly or
in part as surrogate endpoints for monitoring the effectiveness of a prospective drug in the intervention of
pathologies, such as cancers, for example CRC, lung, prostate, and breast, and neurodegenerative diseases, for
example Alzheimer's and ALS.
Drug Screening and Molecular Diagnostic Test for Early Detection of Colorectal
Cancer: Reagents, Methods, and Kits Thereof

Claim of Priority

U.S. Provisional Patent Application No. 60/614,746 entitled MOLECULAR DIAGNOSTIC TEST FOR EARLY
DETECTION OF COLORECTAL CANCER: REAGENTS, METHODS, AND KITS THEREOF, by Nancy M. Lee, et al.,
filed September 30, 2004 (Attorney Docket No. NLEE-01001US0);

U.S. Provisional Patent Application No. 60/651,344 entitled METHODS OF USE OF A BIOMARKER PANEL
FOR DRUG SCREENING, by Nancy M. Lee, et al., filed February 8, 2005 (Attorney Docket No.
NLEE-01002US0); and

U.S. Patent Application No. 11/ , entitled DRUG SCREENING AND MOLECULAR DIAGNOSTIC TEST FOR
EARLY DETECTION OF COLORECTAL CANCER: REAGENTS, METHODS, AND KITS THEREOF, by Nancy M. Lee,
filed September 29, 2005 (Attorney Docket No. NLEE-01001US1).

Cross-Reference to Related Applications 

This application is related to PCT/US2004/022594, entitled "Biomarker Panel for
Colorectal Cancer," by Nancy M. Lee et al., filed July 14, 2004 (Attorney Docket No.
NLEE-01000WO0), which claims priority to U.S. Provisional Application No. 60/488,660, entitled
"Molecular Biomarker Panel for Determination of Colorectal Cancer," by Nancy M. Lee et al.,
filed July 18, 2003 (Attorney Docket No. CPMC-01000US0), and also to U.S. Patent
Application No. 10/690,880, entitled "Biomarker Panel for Colorectal Cancer," by Nancy M.
Lee et al., filed October 22, 2003 (Attorney Docket No. CPMC-01000US1), each of which is
incorporated herein in full, by reference.

Nucleotide and/or amino acid sequence listings are included in this application in
computer-readable form and in hard copy. The information included in computer-readable
form is incorporated herein in full by reference. The information in computer-readable form
is also included on diskette, and such information submitted on diskette is incorporated
herein in full by reference. Compact diskette No. 1 contains the following file:
NLEE1001WO0.ST25.txt (created 9/30/2005, 96K). The total number of diskettes submitted
is one.
Background

The field of art of this disclosure concerns reagents, methods, and kits for the early
detection of colorectal cancer ("CRC"), and methods for drug screening effective in the
treatment of pathologies, such as cancers, for example, CRC, lung, prostate, and breast,
and neurodegenerative diseases, for example Alzheimer's and ALS. These reagents,
methods, and kits are based on a panel of biomarkers that are useful for risk assessment,
early detection, establishing prognosis, evaluation of intervention, recurrence of CRC and
other such pathologies, and drug discovery for therapeutic intervention.

In the field of medicine, clinical procedures providing for the risk assessment and
early detection of CRC have been long sought. Currently, CRC is the second leading cause
of cancer-related deaths in the Western world. One picture that has clearly emerged
through decades of research into CRC is that early detection is critical to enhanced survival
rates.

Thus, one long-sought approach for the early detection of CRC has been the search
for biomarkers that are effective in the early detection of CRC, and therefore that are
effective for the treatment of CRC. For more than four decades, since the discovery of
carcinoembryonic antigen ("CEA"), the search for biomarkers effective for early
detection of CRC has continued. It is further advantageous for sampling methods used in
conjunction with an early diagnostic test for CRC to be minimally invasive or non-invasive.
Non-invasive and minimally invasive sampling methods increase patient compliance, and
generally reduce cost. Additionally, bioinformatic methods for analysis of the complex,
multivariate data typical of bioanalysis, yielding a reliable diagnostic evaluation based on
such data sets, are also desirable.

Therapeutic intervention for numerous types of cancers, such as CRC, lung,
prostate, and breast, includes surgery, chemotherapy, and radiation treatment, and
combinations thereof. For CRC, a current area of continued research and development, in
addition to the search for non-invasive methods for early detection, is in the area of drug
development.

One picture that has clearly emerged through decades of research into CRC is that
early detection, coupled with effective therapeutic intervention, is critical to enhanced survival
rates. To date, the most commonly used drug in the treatment of CRC is 5-fluorouracil
("5FU"), which frequently is administered intravenously, in combination with the folic acid
vitamin, leucovorin. A strategy referred to as primary chemotherapy is used when
metastasis has occurred, and the cancer has spread to different parts of the body. For
CRC, the current strategy for primary chemotherapy is the administration of an oral form of
5FU, capecitabine, in combination with Camptosar, a topoisomerase I inhibitor, or Eloxatin,
an organometallic, platinum-containing drug that inhibits DNA synthesis.

Currently, strategies for new drug development for CRC include two areas of
research: angiogenesis inhibitors, and signal transduction inhibitors.
Novel biopharmaceutical drugs include both protein- and ribozyme-based
therapeutics. Humanized antibody-based therapeutics include examples such as Erbitux
and Avastin. Erbitux, a signal transduction inhibitor, is aimed at inhibiting epidermal growth
factor receptors ("EGFR") on the surface of cancerous cells. Avastin, an angiogenesis
inhibitor, is aimed at inhibiting vascular endothelial growth factor ("VEGF"), which is known
to promote the growth of blood vessels. Additionally, Angiozyme, an example of a
ribozyme-based therapeutic, is an angiogenesis inhibitor directed against the expression of
the VEGF-R1 receptor. New traditional small molecule-based drugs include examples such
as Iressa, based on a quinazoline template, and acting as a signal transduction inhibitor,
and SU11248, based on an indolinone template, which acts as an angiogenesis
inhibitor.

Still, a number of potential drawbacks and uncertainties remain for these nascent
drug therapies for CRC. In addition to typical side effects such as nausea, vomiting,
headache, and diarrhea, other more serious side effects, such as gastrointestinal
perforation, elevated or lowered blood pressure, extreme fatigue, and internal bleeding have
been observed for many of the promising candidates. Additionally, though many of the drug
therapies based on angiogenesis inhibition or signal transduction inhibition appear
promising, they are in the very early stages of clinical trials.

Accordingly, a need exists in the art for biomarkers that are effective in the early
detection of CRC, coupled with sampling methods that are minimally or non-invasive, and
bioinformatic methods, which together produce a robust diagnostic test for the early
detection of CRC. A need also exists in the art for drug development, which can provide
effective treatment prior to the development of cancer for individuals diagnosed with
pathologies, such as cancers, for example CRC, lung, prostate, and breast, and
neurodegenerative diseases, for example Alzheimer's and ALS, while minimizing serious
side effects.

Brief Description of Figures

Fig. 1 is a table listing an embodiment of sequence listings for a panel of biomarkers
of the disclosed invention.

Fig. 2 is a distribution plot of control subjects versus test subjects evaluated using an
aspect of the panel of biomarkers of Fig. 1, and an aspect of a bioinformatic evaluation of
the disclosed invention.

Fig. 3 shows the distribution of the log (base 2) expression values for the genes PPAR-γ,
IL-8, SAA 1 and COX-2 and their cut-off points.

Figs. 4A and 4B show that expression of different genes is altered at different sites
of MNCM from individuals with a family history of colon cancer.

Fig. 5 displays a flow diagram of an aspect of the bioinformatic process used for
evaluating data.

Fig. 6 is an embodiment of a swab sampling and transport system for the minimally
invasive sampling of colonic mucosal cells.

Fig. 7 is a flow chart depicting one aspect of the drug screening disclosure.

Fig. 8 is a flow chart depicting another aspect of the drug screening disclosure.

Detailed Description

To date, a greater understanding of the biology of CRC has been gained through the
research on adenomatous polyposis coli ("APC"), p53, and Ki-ras genes, as well as the
corresponding proteins, and related pathways involved in regulation thereof. However, there is
a distinct difference between research on a specific gene, its expression, protein product,
and regulation, and understanding what genes are critical to include in a panel used for the
analysis of CRC that is useful in the management of patient care for the disease. Panels
that have been suggested for CRC are comprised of specific point mutations of the APC,
p53, and Ki-ras genes, as well as BAT-26, which is a microsatellite instability marker.

For CRC, biomarkers for risk assessment and early detection long have
been sought. The difference between risk assessment and early detection is the degree of
certainty regarding acquiring CRC. Biomarkers that are used for risk assessment confer
less than 100% certainty of CRC within a time interval, whereas biomarkers used for early
detection confer an almost 100% certainty of the onset of the disease within a specified time
interval. Risk factors may be used as surrogate end points for individuals not diagnosed
with cancer, providing that there is an established relationship between the surrogate end
point and a definitive outcome. An example of an established surrogate end point for CRC
is adenomatous polyps. What has been established is that the occurrence
of adenomatous polyps is a necessary, but not sufficient, condition for an individual later to
develop CRC. This is demonstrated by the fact that 90 percent of all preinvasive
cancerous lesions are adenomatous polyps or precursors, but not all individuals with
adenomatous polyps go on later to develop CRC.

Adenomatous polyps have been established as surrogate end points for CRC, and
adenomatous polyps are macroscopically identifiable by colonoscopy or sigmoidoscopy.
During such invasive procedures, biopsy samples can be taken from polyps or lesions for
histological evaluation of the tissue. The molecular diagnostic approach disclosed herein
may be used on grossly normal-appearing colonic mucosal cells that are not from a
macroscopically identifiable polyp or lesion. However, as further disclosed herein, an
invasive procedure need not be used to obtain a patient sample for histological evaluation.
A non-invasive or minimally invasive procedure can be employed to obtain, for example, a
blood sample, stool sample, or swab of grossly normal-appearing rectal cells, upon which a
molecular diagnostic test can be performed to evaluate the presence or absence of CRC.
No previously described approach for early detection of CRC has disclosed the non-invasive
or minimally invasive collection of grossly normal-appearing colonic mucosal cells (biopsy or
swab of rectal cells), blood samples, and/or stool samples, followed by a molecular and/or
protein expression diagnostic test, which can detect changes in the tissue before any
untoward histological changes indicating CRC are manifest.

Fig. 1 is a table that gives an overview of the sequence listings included with this
disclosure. The table of Fig. 1 lists a panel of biomarkers useful in practicing the disclosed
invention. One embodiment of a biomarker panel is the 16 identified coding sequences
given by SEQ. ID NOs 1-16, while another embodiment of a biomarker panel is the 16
identified proteins given by SEQ. ID NOs 17-32. These two embodiments represent
molecular marker panels that provide the selectivity and sensitivity necessary for the early
detection of CRC. It is to be understood that fragments and variants of the biomarkers
described in the sequence listings are also useful biomarkers in embodiments of panels
used for the early detection of CRC. What is meant by fragment is any incomplete or
isolated portion of a polynucleotide or polypeptide in the sequence listing. Further, it is
recognized that almost daily, new discoveries are announced for gene variants, particularly
for those genes under intense study, such as genes implicated in diseases like cancer.
Therefore, the sequence listings given are exemplary of what now is reported for a gene, but
it is recognized that for the purpose of an analytical methodology, variants of the gene and
their fragments also are included.

In Fig. 1, the entries 1-16 in the table are one aspect of a panel of biomarkers, which
are polynucleotide coding sequences, and include the name and abbreviation of the gene.
Entries 17-32 in Fig. 1 are another embodiment of a panel of biomarkers, which are protein,
or polypeptide, amino acid sequences that correspond to the coding sequences for entries
1-16. A biomarker, as defined by the National Institutes of Health ("NIH"), is a molecular
indicator of a specific biological property; a biochemical feature or facet that can be used to
measure the progress of disease or the effects of treatment. A panel of biomarkers is a
selection of biomarkers, which taken together can be used to measure the progress of
disease or the effects of treatment. Biomarkers may be from a variety of classes of
molecules. As previously mentioned, there remains a need for biomarkers for CRC having
the selectivity and sensitivity required to be effective for early detection of CRC. Therefore,
one embodiment of what is disclosed herein is the selection of an effective set of biomarkers
that is differentiating in providing the basis for early detection of CRC.

In one aspect of this disclosure, for the early detection of CRC, expression levels of
polynucleotides indicated as SEQ. ID NOs 1-16 are determined from cells in samples taken
from patients by non-invasive or minimally invasive methods. The contemplated methods
include blood sampling, stool sampling, and rectal cell swabbing or biopsy. Such analysis of
polynucleotide expression levels frequently is referred to in the art as gene expression
profiling. For gene expression profiling, levels of mRNA in a sample are measured as a
leading indicator of a biological state - in this case, as an indicator of CRC. One of the
most common methods for gene expression profiling is to create multiple copies
from mRNA in a biological sample (said sample taken from a patient as disclosed above, by
non- or minimally invasive methods) using a process known as reverse transcription. In the
process of reverse transcription, the mRNA is isolated from cells in the
biological sample, by methods well-known in the art. The mRNA then is used to create
copies of the corresponding DNA sequence from which the mRNA was originally
transcribed. In the reverse transcription amplification process, copies of DNA are created
without the non-coding intervening regions of the gene (i.e., introns). These multiple copies made from
mRNA are therefore referred to as "cDNA," which stands for complementary, or copy, DNA.
Entries 33-64 are the sets of primers that can be used in the reverse transcription process
for each biomarker gene listed in entries 1-16. All nucleotide and amino acid biomarker
sequences identified in SEQ. ID NOs 1-64 are found in a printout attached and included as
subject matter of this application, and are found on a diskette also included as part of this
application and incorporated herein by reference.

Since the reverse transcription procedure amplifies copies of cDNA proportional to
the original level of mRNA in a sample, it has become a standard method that allows the
identification and quantification of even low levels of mRNA present in a biological sample.
Genes may be either up-regulated or down-regulated in any particular biological state, and
hence mRNA levels shift accordingly.

In one aspect of this disclosure, a method for gene expression profiling comprises
the quantitative measurement of cDNA levels for at least two of the biomarkers of the panel
of biomarkers selected from SEQ. ID NOs 1-16, in a biological sample taken from a patient
by a non- or minimally invasive procedure, such as blood sampling, stool sampling, rectal
cell swabbing, and/or rectal cell biopsy. The tissue taken need not be apparently diseased;
in fact, the disclosed invention is contemplated to be useful in evaluating even grossly
normal-appearing cells for detection of CRC. Such a method for gene expression profiling
requires the use of primers, enzymes, and other reagents for the preparation, detection, and
quantification of cDNAs. The method of creating cDNA from mRNA in a sample is referred to
as the reverse transcriptase polymerase chain reaction ("RT-PCR"). The primers listed in
SEQ. ID NOs 33-64 are particularly suited for use in gene expression profiling using RT-PCR
based on the disclosed biomarkers in the biomarker panel. A series of primers was
designed using Primer Express Software (Applied Biosystems, Foster City, CA). Specific
candidates were chosen, and then tested to verify that only cDNA was amplified, and that the
product was not contaminated by genomic DNA. The primers listed in SEQ. ID NOs 33-64 were
specifically designed, selected, and tested accordingly.

The primers listed in SEQ. ID NOs 33-64 are important in the step subsequent to
creating cDNA from isolated cellular RNA, for quantitatively amplifying copies in the real time
PCR of gene expression products of interest. Optimal primer sequence and optimal primer
length are key considerations in the design of primers. The primer sequence may
impact the specificity and sensitivity of the binding of the primer with the template. A primer
length between 18 and 30 bases is considered an optimal range. Theoretically, 18 bases is the
minimal length representing a unique sequence, which would hybridize at only one position
in most eukaryotic genomes. The primers listed in SEQ. ID NOs 33-64 range in
length from 21 to 27 bases, and were designed and validated to amplify cDNA for the
panel of nucleotides selected from SEQ. ID NOs 1-16. The specificity of the primers was
demonstrated by a single product on 10% polyacrylamide gel electrophoresis ("PAGE"), and
a single dissociation curve of the PCR product.

Once the primer pairs have been designed, and validated for specificity, they may be
synthesized in large quantities, and stored for convenient future use. Since the PCR
reaction is sensitive to buffer concentration and buffer constituents, primers should be
maintained in a suitable diluent that will not interfere in the amplification reaction. One
example of a suitable diluent is 10 mM Tris buffer, with or without 1 mM EDTA, depending on
the assay sensitivity to EDTA. Alternatively, another example of a suitable diluent for the
primers is deionized water that is nuclease-free. The primers may be aliquoted in
appropriate containers, such as siliconized tubes, and lyophilized if so desired. The liquid or
lyophilized samples are preferably stored at refrigeration temperatures defined as long-term
for biological samples, which is between about -20 °C and about -70 °C. The concentration of
primer in the amplification reaction is typically between 0.1 and 0.5 µM. The typical dilution
factor from the stock solution to the final reaction mixture is about 10 times, so that the
aliquoted stock solution of the primers is typically between about 1 and 5 µM.
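
The dilution arithmetic in the preceding paragraph can be sketched in a few lines of Python. This is an
illustrative sketch only: the function names and the 25 µL reaction volume are hypothetical and are not
taken from the disclosure; only the approximately 10-fold dilution and the 0.1 to 0.5 µM final range come
from the text.

```python
def stock_concentration_um(final_um, dilution_factor=10.0):
    """Stock primer concentration (µM) needed for a given final concentration.

    Assumes the roughly 10-fold stock-to-reaction dilution stated in the text.
    """
    return final_um * dilution_factor


def stock_volume_ul(final_um, stock_um, reaction_volume_ul):
    """Volume of stock (µL) to pipette, from C1 * V1 = C2 * V2."""
    return final_um * reaction_volume_ul / stock_um


# Final concentrations of 0.1 and 0.5 µM map to 1 and 5 µM stocks.
low_stock = stock_concentration_um(0.1)
high_stock = stock_concentration_um(0.5)

# Hypothetical 25 µL reaction at 0.5 µM final primer from a 5 µM stock.
volume_needed = stock_volume_ul(0.5, high_stock, 25.0)
```

With a 10-fold dilution, the stock volume is always one tenth of the reaction volume, whatever final
concentration is chosen.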

In addition to the specifically designed primers listed in SEQ. ID NOs 33-64,
reagents such as one including a deoxynucleotide triphosphate mixture having all four
deoxynucleotide triphosphates (e.g., dATP, dGTP, dCTP, and dTTP), one having the reverse
transcriptase enzyme, and one having a thermostable DNA polymerase, are required for
RT-PCR. Additionally, buffers, inhibitors, and activators also are required for the RT-PCR
process.

Fig. 2 depicts one aspect of a bioinformatic data reduction process used for the early
detection of CRC, showing a distribution of Mahalanobis distance for 17 controls (left),
compared with 14 individuals with family history of CRC (middle), and 24 individuals with
polyps (right). Tissue samples taken from grossly normal-appearing colonic mucosal tissue
were evaluated using the biomarker panel of polynucleotides selected from SEQ. ID NOs
1-16. The means for the gene expression levels for each of the 16 genes represented by
polynucleotides selected from SEQ. ID NOs 1-16 for each control and test subject were
calculated in the log base 2 domain. The multivariate means, in a 16-dimensional hyperspace,
were then determined for the controls, based on a multivariate normal distribution, in order
to establish limits of normal expression levels. For each control, the Mahalanobis distance
("M-dist") from the multivariate mean of the other 16 controls was measured, while the
M-dist for each of the test subjects was determined from the multivariate mean of the 17
controls. In each group displayed in Fig. 2, all the biopsies from a single individual form a
vertical row. For the individuals with polyps, asterisks mark the biopsies from individuals
with hyperplastic polyps. The horizontal line indicates the 95th percentile of a chi-square
distribution with 16 degrees of freedom. All values above this line (corresponding to an
M-dist of about 25) are different from the mean of controls at a level of p < 0.05. The data
presented clearly show that there is an altered gene expression pattern in grossly normal
colonic mucosal tissue samples for the test subjects. The data accordingly demonstrate the
enhanced sensitivity and selectivity of a diagnostic test using the biomarker panel of
polynucleotides selected from SEQ. ID NOs 1-16.
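
The leave-one-out comparison described for Fig. 2 can be illustrated with a small, self-contained Python
sketch. It is not the analysis code used for the disclosure. The assumptions are: a toy panel of two genes
rather than 16, invented log2 expression values, the sample covariance of the reference controls as the
covariance estimate, and the squared distance as the quantity compared with the chi-square percentile. The
closed-form cutoff used here holds only for 2 degrees of freedom; a 16-gene panel would need a chi-square
table or a statistics library.

```python
import math
from statistics import mean


def mean_and_covariance(rows):
    """Column means and sample covariance matrix of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    mu = [mean(r[j] for r in rows) for j in range(p)]
    cov = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (n - 1)
            for b in range(p)] for a in range(p)]
    return mu, cov


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def squared_mahalanobis(sample, reference):
    """Squared Mahalanobis distance of `sample` from the mean of `reference`."""
    mu, cov = mean_and_covariance(reference)
    diff = [s - m for s, m in zip(sample, mu)]
    return sum(d * w for d, w in zip(diff, solve(cov, diff)))


# Invented log2 expression values for two genes in six control subjects.
controls = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2], [-0.2, -0.2], [0.0, 0.0]]
test_subject = [2.0, -1.5]

# Each control is scored against the remaining controls (leave-one-out).
control_scores = [squared_mahalanobis(c, controls[:i] + controls[i + 1:])
                  for i, c in enumerate(controls)]
# Test subjects are scored against all controls.
test_score = squared_mahalanobis(test_subject, controls)

# 95th percentile of chi-square with 2 degrees of freedom (closed form for df = 2 only).
cutoff_df2 = -2.0 * math.log(0.05)
is_outlier = test_score > cutoff_df2
```

Scoring each control against the others keeps a control from pulling the reference mean toward itself,
which would understate its own distance.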
Fig. 3 displays a flow diagram 300 of an aspect of the bioinformatic process used for
evaluating the data from samples analyzed using expression profiling of polynucleotides
selected from SEQ. ID NOs 1-16. The goal of the bioinformatic analysis used to analyze the
gene expression data for the molecular diagnostic test using the panel of polynucleotides
selected from SEQ. ID NOs 1-16 was to use a single, easy-to-calculate measure of
abnormality. It is desirable to analyze expression patterns of all genes in the panel selected
from SEQ. ID NOs 1-16 by multivariate analysis, since multivariate analysis determines the
significance of changes of all expression levels, taken together. There are several kinds of
multivariate tests which may be useful for the bioinformatic analysis used to assess the
presence or absence of colorectal cancer in patient samples tested using the molecular
diagnostic test disclosed herein. Examples of multivariate analysis tests useful in the
assessment of data from patient samples tested using the panel of polynucleotide
biomarkers selected from SEQ. ID NOs 1-16 include the ANOVA and the Mahalanobis
distance ("M-dist") tests.

Multivariate ANOVA is a global test that accounts for correlations among expression levels. It is
desirable for the multivariate ANOVA tests to be based on Wilks' lambda criterion and to be
carried out on log (base 2) values for the data obtained using the molecular diagnostic test
using the panel of polynucleotides selected from SEQ. ID NOs 1-16 to achieve normal
distribution of values.

M-dist analysis is another example of a multivariate analysis that summarizes, in a
single number, the differences between two patterns of gene expression, taking into account
variability of each gene's expression and correlations among pairs of genes. M-dist is often
used as a test for outliers (individual cases that are significantly different from all other
individual cases in the group) in multivariate data. M-dist can be converted to p-values by
reference to a chi-square distribution with degrees of freedom equal to the number of
variables (i.e., genes). However, to avoid reliance on an assumption of multivariate
normality, it is desirable to compare M-dist for individual cases (i.e., those with polyps) to
controls using a rank sum test, the Mann-Whitney test. By using the Mann-Whitney
analysis, the inferences concerning differences in expression patterns do not depend on the
assumption of multivariate normality. Therefore, this method allows the determination of the
significance of all the experimental subjects' expression levels taken together, as well as the
significance of each individual expression value.
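
A rank sum comparison of M-dist values between cases and controls can be sketched as follows. This is a
minimal illustration under stated assumptions: the M-dist values are invented, the function names are
hypothetical, and the p-value uses the large-sample normal approximation without continuity or tie
correction, which is only rough for groups this small. A statistics package would normally be used to
obtain an exact p-value.

```python
import math


def mann_whitney_u(cases, controls):
    """U statistic: number of (case, control) pairs in which the case is larger.

    Ties count one half, the usual convention.
    """
    u = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                u += 1.0
            elif case_value == control_value:
                u += 0.5
    return u


def two_sided_p_normal_approx(u, n_cases, n_controls):
    """Two-sided p-value from the normal approximation to U (no tie correction)."""
    mean_u = n_cases * n_controls / 2.0
    sd_u = math.sqrt(n_cases * n_controls * (n_cases + n_controls + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))


# Invented M-dist values for illustration only.
case_mdist = [30.0, 40.0, 28.0]
control_mdist = [10.0, 12.0, 15.0, 9.0]

u_stat = mann_whitney_u(case_mdist, control_mdist)
p_value = two_sided_p_normal_approx(u_stat, len(case_mdist), len(control_mdist))
```

Because the test uses only the ordering of the values, it makes no assumption about the distribution of
M-dist itself.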

A working example of the foregoing disclosure is provided below. Hao, C-Y, et al.,
Alteration of Gene Expression in Macroscopically Normal Colonic Mucosa from Individuals
with a Family History of Sporadic Colon Cancer, 11 Clin. Cancer Res., 1400-07 (Feb. 15,
2005). The example presented is provided as a further guide to the practitioner of ordinary
skill in the art, and is not to be construed as limiting the invention in any way.

This example was undertaken to investigate whether expression of several genes
was altered in morphologically normal colonic mucosa ("MNCM") of individuals who have not
developed colon cancer, but are at high risk of doing so because of a family history of CRC.

Human subjects

Biopsies of MNCM from the rectum and sigmoid colon were performed at the time of
routine colonoscopy from individuals seen at the California Pacific Medical Center ("CPMC")
who had no history of prior colon cancer, and who were free of adenomatous polyps, colon
cancer or other colonic lesions at the time of examination. Twelve individuals with a family
history of colon cancer in a first-degree relative (Table 3) and sixteen individuals with no
known family history of colon cancer were included in the study. Although the
family cancer history was obtained from patients' self-reports without confirmation from the
hospital's cancer registry, a recent study has confirmed the accuracy of self-reported family
history with regard to colon cancer. Of the twelve individuals with a family history of colon
cancer, two are mother and daughter (cases #6 and 7 in Table 3), two are sister and brother
(cases #11 and 12), and the rest are not related. Study subjects ranged in age from 18 to
64 years in the group with a family history of colon cancer, and 16 to 83 years in the control
group (the 16-year-old had undergone colonoscopy for chronic abdominal pain). The
research protocols for obtaining normal biopsy specimens for study were approved by the
CPMC Institutional Review Board. The appropriate procedure for obtaining informed
consent was followed for all study subjects.

Extraction and preparation of RNA and cDNA

Biopsy samples obtained from the segment of colon between the cecum and the
hepatic flexure were classified as ascending colon samples; those from the segment of
colon between the hepatic flexure and the splenic flexure as transverse colon samples;
those from the segment of colon below the splenic flexure as descending colon samples; and
those from the winding segment of colon below the descending colon as rectosigmoid
colon samples (approximately 5-25 cm from rectum). The number of biopsy samples
obtained from each patient varied. Two to eight biopsy samples were obtained from each
colon segment, except that only one sample was obtained from the transverse and the
descending colon segments in one subject of the family history group. A total of 39
ascending colon, 37 transverse colon, 45 descending colon and 77 rectosigmoid specimens
were obtained from the 12 individuals with a family history of colon cancer; and a total of 53
ascending colon, 48 transverse colon, 49 descending colon and 104 rectosigmoid
specimens were obtained from the 16 individuals with no family history of colon cancer. All
biopsy samples were snap-frozen on dry ice and taken immediately to the laboratory for
RNA preparation and reverse transcription as described.
Analysis of gene expression

The expression levels of oncogene c-myc, CD44 antigen ("CD44"), cyclooxygenase
1 and 2 ("COX-1" and "COX-2"), cyclin D1, cyclin-dependent kinase inhibitor ("p21cip/waf1"),
interleukin 8 ("IL-8"), interleukin 8 receptor ("CXCR2"), osteopontin ("OPN"), melanoma
growth stimulatory activity ("Gro-α/MGSA"), GRO3 oncogene ("Gro-γ"), macrophage colony
stimulating factor 1 ("MCSF-1"), peroxisome proliferative activated receptor, alpha, delta and
gamma ("PPAR-α, δ and γ") and serum amyloid A 1 ("SAA 1") were analyzed by quantitative
RT-PCR. In brief, the cycle numbers ("CT value")
were recorded when the accumulated PCR products crossed an arbitrary threshold. To
normalize this value, a ΔCT value was determined as the difference between the CT value
for each gene tested and the CT value for β-actin. The average ΔCT value for each gene in
the control group was calculated. The ΔΔCT value was determined as the difference
between the ΔCT value for each individual sample and the average ΔCT value for this gene
obtained from the control samples. These ΔΔCT values were then used to calculate relative
gene expression values as described (Applied Biosystems, User Bulletin #2, December 11,
1997). All PCR reactions were performed in duplicate when cDNA samples were available. The
results were also verified using histidyl-tRNA synthetase as internal control. Relative gene
expression values yielded similar results using either β-actin or his-tRNA synthetase as a
reference. Statistical analyses reported here were obtained using β-actin as the normalization
control.
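
The normalization steps described above can be followed with a short Python sketch. It is illustrative
only: the CT values are invented, the function names are hypothetical, and the final conversion uses the
comparative CT formula, relative expression = 2^(-ΔΔCT), which assumes that the amount of product
doubles with every cycle for both the target gene and the reference gene.

```python
from statistics import mean


def delta_ct(ct_gene, ct_reference):
    """ΔCT: CT of the gene of interest minus CT of the reference gene (e.g. β-actin)."""
    return ct_gene - ct_reference


def relative_expression(ct_gene, ct_reference, mean_control_delta_ct):
    """Relative expression from ΔΔCT, assuming perfect doubling per PCR cycle."""
    delta_delta_ct = delta_ct(ct_gene, ct_reference) - mean_control_delta_ct
    return 2.0 ** (-delta_delta_ct)


# Invented (gene CT, reference CT) pairs for two control samples.
control_pairs = [(25.0, 18.0), (26.0, 18.0)]
mean_control = mean(delta_ct(gene, ref) for gene, ref in control_pairs)  # 7.5

# Invented test sample: its ΔCT is 6.5, so ΔΔCT is -1 (one cycle earlier than controls).
fold_change = relative_expression(24.5, 18.0, mean_control)
```

A ΔΔCT of -1 corresponds to a two-fold higher expression than the control average; a ΔΔCT of +1
corresponds to half.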

Statistical analysis

Gene expression patterns were compared between individuals with a family history
of colon cancer and the control group subjects who had no family history of colon cancer.
Rather than testing expression of each gene separately and adjusting for multiple
comparisons by methods that reduce statistical power, we tested the expression patterns of
all genes by multivariate analysis of variance ("MANOVA") with Wilks' lambda criterion. This
test is a multivariate analog of the F-test for univariate analysis of variance, which tests the
equality of means. This type of analysis takes into account correlations among gene
expression levels and controls the false-positive rate by providing a single test of whether
the expression patterns, based on all the genes in the subset, differ between groups.

If there was evidence that expression patterns differed between groups, we used
univariate t-tests to determine which genes were contributing to the global difference. All
MANOVA tests were based on the Wilks' lambda criterion and were carried out on log (base
2) of the expression levels, since this transformation was required to achieve normal
distributions. Our data consisted of a variable number of samples per subject with different
numbers of individuals per group (family history vs. no family history). The analysis included
random effects terms for individuals within group and for samples within individuals to
account for the sampling scheme. If $Y_{ijk}$ denotes a log2 gene expression value for the k-th
sample from the j-th patient from the i-th group, the statistical model is described
mathematically by the equation: $Y_{ijk} = \mu + A_i + B_{ij} + e_{ijk}$, where $A_i$ is the
(fixed) group effect, $B_{ij}$ is the (random) patient effect, and $e_{ijk}$ is the (random)
sample within patient effect.
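As a sketch of what the nested random-effects model implies, suppose the patient effect and the sample-within-patient effect are independent, with zero mean and variances $\sigma_B^2$ and $\sigma_e^2$. These symbols and the independence and normality assumptions are introduced here for illustration and are not stated in the text.

```latex
Y_{ijk} = \mu + A_i + B_{ij} + e_{ijk}, \qquad
B_{ij} \sim N(0, \sigma_B^2), \quad e_{ijk} \sim N(0, \sigma_e^2)

\operatorname{Var}(Y_{ijk}) = \sigma_B^2 + \sigma_e^2, \qquad
\operatorname{Cov}(Y_{ijk}, Y_{ijk'}) = \sigma_B^2 \quad (k \neq k')
```

Under these assumptions, samples taken from the same patient are positively correlated, which is why multiple biopsies from one individual cannot be treated as independent observations.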

We also tested whether or not the magnitude of the differential expression (over or
under expression) increased along the colon from the ascending portion toward the rectum, by
defining a variable with value 1 for samples from the ascending, 2 for samples from the
transverse, 3 for samples from the descending and 4 for samples from the rectosigmoid
portion of the colon. This variable was added to the model so that its effect could be tested
for certain genes using univariate ANOVA.

Definition of cut-off point

The log (base 2) of the expression levels of all the biopsy samples from the control
group was used to calculate the cut-off point for either up-regulation or down-regulation of
each gene. A table of tolerance bounds for a normal distribution was used to define cut-off
points so that a fraction of the distribution of no more than P would lie above the cut-off point
for up-regulated genes or below the cut-off point for down-regulated genes. Each cut-off
point was defined by cut-off point = mean + k(SD), where the mean and SD (Standard
Deviation) are based on values from the control group. Values of k are found in the table
and depend on the P value and the number of normal samples (Owen, D.B., Noncentral t
and tolerance limits, in Birnbaum Z.W., ed., Handbook of Statistical Tables, Reading, MA:
Addison-Wesley, 1962, 108-127). Assuming a Gaussian distribution of expression levels of
each gene, one would expect less than 1% of the biopsies from a normal population to have
an expression level exceeding the 99% tolerance limit (p = 0.01).
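The cut-off definition above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the tolerance factor k must be looked up in a tolerance-bound table for the chosen P and sample count (the value used below is a placeholder), the lower cut-off is assumed to be the mirror image mean - k(SD), and the sample standard deviation (n - 1 denominator) is assumed.

```python
from math import log2
from statistics import mean, stdev


def cutoff_points(control_expression: list[float], k: float) -> tuple[float, float]:
    """Return (lower, upper) cut-offs on the log2 scale from control-group values.

    k is the tolerance factor taken from a published table; it is NOT computed here.
    """
    logs = [log2(v) for v in control_expression]
    m, sd = mean(logs), stdev(logs)  # sample SD; an assumption
    return m - k * sd, m + k * sd


def classify(expression: float, lower: float, upper: float) -> str:
    """Label one relative-expression value against the log2 cut-offs."""
    x = log2(expression)
    if x > upper:
        return "up"
    if x < lower:
        return "down"
    return "normal"


# Hypothetical control values and a placeholder k, for illustration only.
lower, upper = cutoff_points([0.5, 1.0, 2.0], k=2.0)
print(classify(8.0, lower, upper), classify(0.125, lower, upper), classify(1.0, lower, upper))
```

Working on the log2 scale mirrors the text, which applies the tolerance bounds to log-transformed expression levels because that transformation was needed to achieve normality.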

To calculate the probability that the number of observed samples outside the upper
99 percentile was due to chance in each case, we used the binomial distribution method
with p = 0.01 and n = the number of samples for each case multiplied by the number of
genes tested. For example, for case #1 (Table 3) we had 2 samples; both showed
abnormal expression for PPAR-γ and SAA1, one of two for PPAR-δ and neither was
abnormal for IL-8 and COX-2. Thus, for this case, 5 of 10 tested were beyond the upper
0.01 boundary. The probability that this happened by chance is $2.4 \times 10^{-8}$. The
general formula is given by:

$$\Pr\{x \ge k \mid p, n\} = \sum_{j=k}^{5n} \binom{5n}{j} (0.01)^{j} (0.99)^{5n-j}$$

where k is the number beyond the 99 percentile and n is the number of samples (5 is the
number of genes tested).
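The chance probability quoted for case #1 can be checked numerically with a short sketch of the binomial upper tail. The function name is hypothetical; the inputs (2 samples, 5 genes, 5 values beyond the cut-off, p = 0.01) are taken from the example in the text.

```python
from math import comb


def binomial_tail(k: int, trials: int, p: float = 0.01) -> float:
    """Probability of k or more values beyond the cut-off in `trials` independent tests."""
    return sum(
        comb(trials, j) * p**j * (1 - p) ** (trials - j)
        for j in range(k, trials + 1)
    )


# Case #1: 2 samples x 5 genes = 10 tests, 5 of them beyond the 0.01 boundary.
p_case1 = binomial_tail(k=5, trials=2 * 5)
print(f"{p_case1:.1e}")  # prints 2.4e-08
```

The calculation treats the gene-by-sample tests as independent, which is the assumption built into the binomial formula.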
Results 

Altered gene expression in the rectosigmoid mucosa of individuals with a family 
history of colon cancer: 

Twelve individuals (ten women and two men) comprised the group with a family
history of colon cancer; 16 individuals (nine women and seven men) served as the control
group (Table 1). We analyzed a total of 92 ascending colon biopsy samples, 85 transverse
colon samples, 94 descending colon biopsy samples and 181 rectosigmoid biopsy samples
for levels of expression of 16 genes. Expression of these genes is known to be altered in
the late stages of human colon cancers. We have also shown that some of these genes are
altered in the MNCM from surgical resections of colon cancer patients.

Continuing to refer to Table 1, results represent analysis of 104 biopsy samples
from the 16 individuals without family history and 77 biopsy samples from 12 individuals
with family history of colon cancer in a first-degree relative. Samples were analyzed for
gene expression as described in Methods. The numbers in the table represent the
expression level relative to the average ΔCT of the control group. If there is no variation
among individuals, the normal gene expression level in the control group should equal 1.
Multivariate analysis using the Wilks' lambda criterion was carried out on log2
expression values of the 16 genes to determine the significance of the difference between
the two groups. Genes are listed from smallest to largest P value.

Multivariate analysis of the expression values of all 16 genes indicated a significant
difference in the biopsy samples from the rectosigmoid region (p = 0.01) between those with
and those without a family history of sporadic colon cancer. Gene expression in biopsy
samples from the descending, ascending and transverse colon did not vary significantly
between these two groups of individuals (p = 0.06, 0.22 and 0.52 respectively). Most of the
differences in rectosigmoid biopsy samples were contributed by just five of these genes
(Table 1): PPAR-γ, SAA1, IL-8, COX-2 and PPAR-δ. Similar to the alterations of gene
expression in the MNCM of cancer patients, we found that the expression levels of SAA1,
IL-8 and COX-2 were up-regulated and those of PPAR-γ and PPAR-δ were down-regulated
in the MNCM of individuals with a family history of sporadic colon cancer.

The mean (± SD) age in the family history group was lower (45 ± 12 years) than
that of the control group (56 ± 16 years), presumably because of heightened awareness of
the need for early colonoscopy in the group with a family history of colon cancer. In
addition, there is a sex difference between these two groups (ten women and two men in the
family history group versus nine women and seven men in the control group). However, we
found that sex did not affect the level of gene expression (p=0.67). Moreover, there was no
correlation between age and the expression levels of SAA1, IL-8, COX-2 and PPAR-γ (all p >
0.05) except for PPAR-δ (0.01). Nevertheless, abnormal expression (down-regulation) of
PPAR-δ increases with age. Thus comparison between the younger family history group and
older controls would be biased toward finding fewer, rather than more, abnormal
expressions in the family history group. In other words, we may underestimate the
incidence of altered expression of PPAR-δ in the family history group.

Table 1. Gene expression levels in normal rectosigmoid biopsy samples from
individuals with family history of colorectal cancer as compared with controls

Genes      Controls (n=104)                  Patients with family history (n=77)   P Values
           Range          Mean ± (S.D.)      Range          Mean ± (S.D.)
PPAR-γ     0.44 - 1.65    1.07 ± 0.41        0.20 - 2.59    0.79 ± 0.40            0.006
SAA1       0.17 - 22      2.16 ± 3.67        0.33 - 2343    151 ± 452              0.02
IL-8       0.14 - 13      1.71 ± 1.94        6.84 - 13      6.84 ± 2.82            0.02
COX-2      0.17 - 18      1.82 ± 2.75        0.24 - 30      5.11 ± 9.01            0.07
PPAR-δ     0.39 - 2.66    1.11 ± 0.48        0.16 - 2.22    0.89 ± 0.46            0.07
CD44       0.35 - 4.13    1.14 ± 0.64        0.11 - 4.98    1.41 ± 0.78            0.12
c-Myc      0.24 - 3.66    1.21 ± 0.75        0.26 - 4.31    1.48 ± 0.82            0.14
MCSF-1     0.38 - 22      1.81 ± 2.59        0.20 - 11      2.04 ± 2.19            0.21
Gro-α      0.01 - 51      2.61 ± 5.48        0.34 - 57      5.76 ± 11.63           0.22
Gro-γ      0.16 - 35      2.18 ± 4.29        0.12 - 41      2.55 ± 5.91            0.25
P21        0.51 - 2.15    1.10 ± 0.62        0.20 - 7.68    0.90 ± 0.32            0.27
PPAR-α     0.31 - 2.38    1.09 ± 0.55        0.26 - 2.21    1.00 ± 0.40            0.54
CXCR2      0.22 - 13      1.45 ± 1.78        0.43 - 4.44    1.49 ± 1.55            0.55
OPN        0.19 - 13      1.66 ± 2.05        0.15 - 12      1.41 ± 1.92            0.73
CyclinD    0.34 - 3.48    1.28 ± 0.85        0.13 - 3.21    1.29 ± 0.79            0.81
COX-1      0.27 - 5.97    1.21 ± 0.85        0.25 - 2.63    1.09 ± 0.51            0.87

Comparison with cut-off points for "normal" gene expression 

Relative gene expression levels in the rectosigmoid samples varied among
individuals, much more so in samples obtained from the individuals with a family history of
colon cancer than the corresponding values from the controls (Table 1). We therefore used
the expression level of each gene in the control group to define the "normal" expression
level for each gene by calculating a cut-off point (p = 0.01) for each gene. Figure 3 shows
the distribution of the log (base 2) expression values for the genes PPAR-γ, IL-8, SAA1 and
COX-2 and their cut-off points. As expected, less than 1% of the biopsy samples from the
control group had expression of these genes above or below the cut-off lines (p = 0.01,
Figure 3). However, 21%, 12% and 8% of the biopsy samples from the family history group
had expression of SAA1, IL-8 and COX-2, respectively, above the cut-off points, and 12% of
them had expression of PPAR-γ below the cut-off point (Table 2).

Table 2. Number of biopsy samples (N) with gene expression above/below the
cut-off point in normal individuals and individuals with a family history of colon cancer

Genes      Biopsy samples from          Biopsy samples from individuals
           Normal Controls (n=104)      with Family History (n=77)
           N (%)                        N (%)
PPAR-γ     0                            9 (12%)†‡
SAA1       0                            16 (21%)**
IL-8       0                            9 (12%)**
COX-2      1 (1%)*                      6 (8%)**
PPAR-δ     0                            2 (3%)†
Gro-γ      1 (1%)*                      2 (3%)*
PPAR-α     0                            2 (3%)†
Gro-α      0                            0
MCSF-1     1 (1%)*                      0
OPN        1 (1%)*                      0
P21        0                            0
CD44       1 (1%)*                      0
CXCR2      1 (1%)*                      0
c-Myc      0                            0
CyclinD    0                            0
COX-1      0                            0

† with gene expression level below the cut-off point
* with gene expression level above the cut-off point
‡ number of patients with alterations are listed in Table 3.

We next analyzed each individual in the family history group (Table 3). The number
of biopsy samples which exhibited expression levels below (for PPAR-γ and δ) or above (for
IL-8, SAA1 and COX-2) the cut-off point (p=0.01) is indicated. Individuals with all the
biopsy samples exhibiting expression levels within the normal range are indicated with a (-)
sign. All the grandparents with colon cancers in this study are maternal. Ages of the family
member when colon cancer was diagnosed are indicated as follows: *** indicates that colon
cancer was diagnosed before 50 years of age; ** indicates before 60 years of age; and *
indicates after 60 years of age. Ages of the rest of the family members when colon cancer
was diagnosed are not available. None of the twelve patients in the family history group
reported other types of cancer in the family except that the father of the patient for case #10
had lung cancer in the 1970s.

As evidenced in Table 3, for the five most commonly altered genes, nine of the
twelve individuals with a family history of colon cancer had at least one biopsy sample with
expression levels below or above the cut-off point. Two individuals (cases #1 and 2) had
altered expression of three of these genes in apparently normal rectosigmoid mucosa. In
contrast, only one of the sixteen individuals in the control group had altered expression of
one of these five genes (see Table 2). The cut-off is set so that 1% of expressions could be
false positives. However, the numbers of biopsy samples obtained from each individual are
different. To make an adjustment for the number of specimens, we also calculated, for each
case, the probability that the number of observed samples outside the upper 99 percentile
was due to chance. This calculation was based on the binomial distribution. As shown in
Table 3, the observed altered gene expression in seven of the twelve individuals of the
family history group is unlikely to be due to chance (p < 0.01). In these seven cases,
expression of at least two of the five genes was altered. In addition, among the sixteen genes
analyzed, PPAR-γ and SAA1 are the most frequently altered genes, with alterations occurring
in five of the twelve individuals with a family history of colon cancer (Table 3).

Expression of different genes is altered at different sites of MNCM from individuals
with a family history of colon cancer.

Analysis of individual cases from the family history group showed that different genes
were altered in rectosigmoid biopsy samples in different subjects. For instance, SAA1 and
PPAR-γ were altered in case #3, IL-8 and SAA1 were altered in case #4; while COX-2 and
IL-8 but not SAA1 were altered in case #8 (Figure 4A). In addition, some genes were altered in
all the rectosigmoid biopsy samples from the same patient (such as SAA1 in case #4 and IL-8
in case #8), while others were only altered in some of these biopsy samples (i.e., SAA1 and
PPAR-γ in case #3, IL-8 in case #4 and COX-2 in case #8). In addition, some of these
alterations are restricted to the rectosigmoid regions, such as IL-8 in case #4; while others can
be extended to other regions of the colon, such as SAA1 in case #4 (Figure 4B).

We also observed that the difference in gene expression between the two groups of
individuals increased along the length of the colon for PPAR-γ (p=0.001 for trend) and SAA1
(p < 0.001), but not for IL-8 (p = 0.20), COX-2 (p = 0.58), nor PPAR-δ (p = 0.54). These results
suggest that there is an increasing abnormality along the colon going from the ascending to
the rectal portion between the two groups of individuals that can be detected despite reduced
numbers of samples toward the ascending portion in this study.

From the foregoing example, it was possible to draw the following conclusions.
Approximately 5-10% of colorectal cancers occur among patients with one of the two
autosomal dominant hereditary forms of colon cancer (familial adenomatous polyposis and
hereditary nonpolyposis colorectal cancer), or who have inflammatory bowel disease (Burt R.,
Peterson G.M. In: Young G., Rozen, P. & Levin, B. Saunders, ed. in Prevention and Early
Detection of Colorectal Cancer, Philadelphia, 171-194 (1996)). Of the remaining colon
cancers, approximately 20% are associated with a family history of colon cancer, which is
associated with a two-fold increased risk of developing colon cancer (Smith R.A., von
Eschenbach A.C., Wender R., et al., American Cancer Society guidelines for the early
detection of cancer: update of early detection guidelines for prostate, colorectal, and
endometrial cancers, and Update 2001-testing for early lung cancer detection, 51 CA Cancer
J Clin. 38-75; quiz 77-80 (2001)). Although linkage to chromosomes 15q13-14 and
9q22.2-31.2 has been reported in a subset of patients with familial colorectal cancer (Wiesner
G.L., Daley D., Lewis S., et al., A subset of familial colorectal neoplasia kindreds linked to
chromosome 9q22.2-31.2, 100 Proc Natl Acad Sci USA, 12961-5 (2003)), the genetic basis
for most of these cases is not known. In this study, we have demonstrated substantial
alterations in the expression of PPAR-γ, IL-8 and SAA1 in the rectosigmoid MNCM from
individuals with a family history of sporadic colon cancer, even though these individuals had
no detectable colon abnormalities. Our previous study showed that, in addition to PPAR-γ,
IL-8 and SAA1, expressions of PPAR-δ, p21, OPN, COX-2, CXCR2, MCSF-1 and CD44 were
also altered significantly in the MNCM of colon cancer patients when compared to normal
controls without colon cancer, polyps, or family history. These observations suggest that
altered expression of genes related to cancer development in the MNCM may be a sequential
event and may occur earlier than the appearance of gross morphological abnormalities. For
example, altered expression of PPAR-γ, SAA1 and IL-8 may occur in MNCM of individuals
who have not developed colon cancer, but are at high risk of doing so; while altered
expressions of other genes, such as PPAR-δ, p21, OPN, COX-2, CXCR2, MCSF-1 and CD44,
may occur later in MNCM of individuals who have already developed a colon cancer (Chen
L-C, Hao C-Y, Chiu Y.S.Y., et al., Alteration of Gene Expression in Normal Appearing Colon
Mucosa of APC min Mice and Human Cancer Patients, 64 Cancer Research 3694-3700 (2004)).

Genetic and epigenetic changes have been reported in macroscopically normal
tissues for several neoplasms (Tycko B., Genetic and epigenetic mosaicism in cancer
precursor tissues, 983 Ann N Y Acad Sci., 43-54 (2003)). For example, allelic loss has been
demonstrated in normal breast terminal ductal lobular units adjacent to primary breast
cancers (Deng G., Lu Y., Zlotnikov G., Thor A.D., Smith H.S., Loss of heterozygosity in
normal tissue adjacent to breast carcinomas, 274 Science, 2057-9 (1996)). Such allelic loss
is associated with an increased risk of local recurrence (Li Z., Moore D.H., Meng Z.H., Ljung
B.M., Gray J.W., Dairkee S.H., Increased risk of local recurrence is associated with allelic loss
in normal lobules of breast cancer patients, 62 Cancer Res., 1000-3 (2002)). In addition,
normal-appearing colonic mucosal cells from individuals with a prior colon cancer are more
resistant to bile acid-induced apoptosis than mucosal cells from individuals with no prior colon
cancer (Bernstein C., Bernstein H., Garewal H., et al., A bile acid-induced apoptosis assay for
colon cancer risk and associated quality control studies, 59 Cancer Res., 2353-7 (1999); and
Bedi A., Pasricha P.J., Akhtar A.J., et al., Inhibition of apoptosis during development of
colorectal cancer, 55 Cancer Res., 1811-6 (1995)). Since apoptosis is important in colonic
epithelium to eliminate cells with unrepaired DNA damage (Payne C.M., Bernstein H.,
Bernstein C., Garewal H., Role of apoptosis in biology and pathology: resistance to apoptosis
in colon carcinogenesis, 19 Ultrastruct Pathol., 221-48 (1995)), reduction in apoptosis could
result in the retention of DNA-damaged cells and increase the risk of carcinogenic mutations.

PPAR-γ is down-regulated in several carcinomas. Ligands of PPAR-γ inhibit cell
growth and induce cell differentiation (Kitamura S., Miyazaki Y., Shinomura Y., Kondo S.,
Kanayama S., Matsuzawa Y., Peroxisome proliferator-activated receptor gamma induces
growth arrest and differentiation markers of human colon cancer cells, 90 Jpn J Cancer Res
75-80 (1999)), and loss-of-function mutations in PPAR-γ have been reported in human colon
cancer (Sarraf P., Mueller E., Smith W.M., et al., Loss-of-function mutations in PPAR gamma
associated with human colon cancer, 3 Mol. Cell, 799-804 (1999)). Thus, our observation of
down-regulation in PPAR-γ expression in MNCM may represent an early event that promotes
colonic epithelial cell growth and inhibits cellular differentiation. In addition, PPAR-γ
negatively regulates inflammatory response (Welch J.S., Ricote M., Akiyama T.E., Gonzalez
F.J., Glass C.K., PPAR gamma and PPAR delta negatively regulate specific subsets of
lipopolysaccharide and IFN-gamma target genes in macrophages, 100 Proc Natl Acad Sci U S
A 6712-7 (2003)). Inflammation favors tumorigenesis by stimulating angiogenesis and cell
proliferation (Nakajima N., Kuwayama H., Ito Y., Iwasaki A., Arakawa Y., Helicobacter pylori,
neutrophils, interleukins, and gastric epithelial proliferation, 25 Suppl. 1 J Clin Gastroenterol.,
98-202 (1997)). Similarly, IL-8 and the acute-phase protein SAA1 modulate the inflammatory
process (Dhawan P., Richmond A., Role of CXCL1 in tumorigenesis of melanoma, 72 J
Leukoc Biol., 9-18 (2002); and Urieli-Shoval S., Linke R.P., Matzner Y., Expression and
function of serum amyloid A, a major acute-phase protein, in normal and disease states, 7
Curr Opin Hematol., 64-9 (2000)). Up-regulation of pro-inflammatory cytokines and acute
phase proteins has been reported in the colon mucosa of individuals with inflammatory bowel
disease (Niederau C., Backmerhoff F., Schumacher B., Inflammatory mediators and acute
phase proteins in patients with Crohn's disease and ulcerative colitis, 44
Hepatogastroenterology, 90-107 (1997); and Keshavarzian A., Fusunyan R.D., Jacyno M.,
Winship D., MacDermott R.P., Sanderson I.R., Increased interleukin-8 (IL-8) in rectal dialysate
from patients with ulcerative colitis: evidence for a biological role for IL-8 in inflammation of the
colon, 94 Am J Gastroenterol., 704-12 (1999)), who are at very high risk of developing colon
cancer (Bachwich D.R., Lichtenstein G.R., Traber P.G., Cancer in inflammatory bowel
disease, 78 Med Clin North Am., 1399-412 (1994)). Epidemiological observations also
suggest that chronic inflammation predisposes to colorectal cancer (Rhodes J.M., Campbell
B.J., Inflammation and colorectal cancer: IBD-associated and sporadic cancer compared, 8
Trends Mol Med., 10-6 (2002); and Farrell R.J., Peppercorn M.A., Ulcerative colitis, 359
Lancet 331-40 (2002)). Thus, the observation of down-regulation of PPAR-γ and
up-regulation of IL-8 and SAA1 in the normal mucosa of individuals with a family history of
sporadic colon cancer and individuals with inflammatory bowel disease may indicate the
involvement of common pathways leading to colon carcinogenesis in these two groups.

Our observation of altered expression of genes associated with cancer and
inflammation in normal colonic mucosa in some individuals with a family history of colon
cancer is consistent with the recent report of association of elevated serum C-reactive protein
("CRP") concentration prior to the development of colon cancer (Erlinger T.P., Platz E.A., Rifai
N., Helzlsouer K.J., C-reactive protein and the risk of incident colorectal cancer, 291 JAMA,
585-90 (2004)). These findings suggest that inflammation is a risk factor for the development
of colon cancer in average-risk individuals (id.). However, CRP is a nonspecific marker of
inflammation that may indicate inflammation in tissues other than colon. In our study, we have
analyzed the tissue where colon cancer arises, which would be more specific in assessing the
risk of developing colon cancer.

We do not know which cell type is responsible for the observed altered gene
expression. There are many cell types in the colonic mucosa, including several types of
mucosal epithelial cells, stromal cells and blood-borne cells. Studies from our group and
others have demonstrated that the up-regulation of COX-2 protein in MNCM is localized
primarily to the infiltrating macrophages and secondarily to the epithelial cells in aberrant crypt
foci in the MNCM of APC min mice (Chen L-C, Hao C-Y, Chiu Y.S.Y., et al., Alteration of Gene
Expression in Normal Appearing Colon Mucosa of APC min Mice and Human Cancer Patients,
64 Cancer Research 3694-3700 (2004); and Hull M.A., Booth J.K., Tisbury A., et al.,
Cyclooxygenase 2 is up-regulated and localized to macrophages in the intestine of Min mice,
79 Br J Cancer, 1399-405 (1999)). From our previous studies of MNCM of APC min mice,
detection of the gene products that are up- or down-regulated in MNCM by
immunohistochemical staining was found to be technically difficult, perhaps because the
secreted proteins, such as IL-8 and SAA1, are evanescent in tissue sections (Chen L-C, Hao
C-Y, Chiu Y.S.Y., et al., Alteration of Gene Expression in Normal Appearing Colon Mucosa of
APC min Mice and Human Cancer Patients, 64 Cancer Research 3694-3700 (2004)). Due to
the limited amount of the biopsy samples and technical difficulties, we were unable to perform
immunohistochemical staining to demonstrate the cell types contributing to the altered gene
expression. If the absolute RNA quantities are sufficient, RNA in situ hybridization may be a
better method to determine the cellular locations of alterations. Alternatively, laser
microdissection followed by RT-PCR may be able to define the cell types involved.

Regardless of the cell types responsible for the altered gene expression, our results
demonstrate that relative to normal individuals without family history of colon cancer, altered
gene expression is present in normal colon mucosa of some individuals with a family history
of colon cancer and these individuals are known to have an increased risk of developing colon
cancer (Burt R., Peterson G.M. In: Young G., Rozen, P. & Levin, B. Saunders, ed. in
Prevention and Early Detection of Colorectal Cancer, Philadelphia, 171-194 (1996)).

Among patients with altered gene expression in the rectosigmoid biopsy samples,
some showed alterations in all biopsy samples (i.e., expression of SAA1 in cases #4 and #12),
while others showed altered expression in some biopsy samples only (i.e., PPAR-γ in cases
#2 and #3, Figure 2). Since most samples were assayed with multiple genes in duplicate to
ensure the quality of cDNA, such heterogeneity is unlikely to be due to technical variation. We
speculate that this heterogeneity might reflect the frequency and/or the distribution of "hot
spots" in these individuals. It is possible that the individuals with altered gene expression in
all rectosigmoid biopsy samples may have wide-spread molecular abnormalities in their
rectosigmoid mucosa, while those with altered expression in some of the biopsy samples
have discrete hot spots. Thus, individuals in the former group may have a global
predisposition to development of colon polyps or cancer, while those in the latter group may
have local predisposition. Whether the risks in developing colon cancer or polyps differ
between these two groups is unknown. In addition, altered expression of different
combinations of genes was observed in the rectosigmoid biopsy samples of individuals in the
family history group. This observation suggests that different molecular pathways may be
involved in the early stages of colon carcinogenesis. Whether altered gene expression in
certain molecular pathways is associated with higher risk of polyps or cancer also remains to
be determined.

Consistent with the reports of more aberrant crypt foci (the preneoplastic colonic
lesions) in the distal colon than in the proximal colon of the sporadic colon cancer patients
and the carcinogen-treated mice (Shpitz B., Bomstein Y., Mekori Y., et al., Aberrant crypt foci
in human colons: distribution and histomorphologic characteristics, 29 Hum Pathol., 469-75
(1998); and Salim E.I., Wanibuchi H., Morimura K., et al., Induction of tumors in the colon
and liver of the immunodeficient (SCID) mouse by 2-amino-3-methylimidazo[4,5-f]quinoline
(IQ)-modulation by long chain fatty acids, 23 Carcinogenesis, 1519-29 (2002)), we found that
most of the alterations in gene expression were found in the distal colon of the individuals
from the family history group. We speculate that the distal colon mucosa of the susceptible
individuals may be exposed to higher concentration of exogenous substances present in the
stool than mucosa in other colon regions after most of the water is re-absorbed at the end of
the large intestine, and such exposure may lead to higher rate of altered gene expression at
this region.

We have shown that family history of colon cancer, but not age or sex, is the factor
responsible for the observed differences in gene expression in the rectosigmoid mucosa of
the two groups. The available information did not indicate any specific difference in diet or
medication between these two groups of patients. However, we cannot eliminate the
possibility that diet or medication affects gene expression without further study. Not all
individuals with a family history of colon cancer will develop cancer or adenomatous polyps of
the colon (Smith R.A., von Eschenbach A.C., Wender R., et al., American Cancer Society
guidelines for the early detection of cancer: update of early detection guidelines for prostate,
colorectal, and endometrial cancers, and Update 2001-testing for early lung cancer
detection, 51 CA Cancer J. Clin., 38-75; quiz 77-80 (2001)). Consistent with this clinical
observation, our analysis also showed that not all the individuals with a family history of colon
cancer have altered gene expression in MNCM. Since the genes analyzed in this study are
involved in the development of colon cancer, we hypothesize that individuals with altered
gene expression in the MNCM may be more susceptible to developing polyps or cancer than
those without altered gene expression. To test this hypothesis, a prospective study with a
larger number of study subjects will be needed. If such an association is confirmed, it may be
possible to identify individuals at increased risk of developing colon cancer by using gene
expression analysis of rectosigmoid biopsy samples. Theoretically, it is easier to identify
individuals with global alterations in the MNCM than individuals with local alterations by
analysis of random MNCM samples. However, if an appropriate panel of genes was selected
for analysis using multiple samples, it may have enough predictive power to identify such
patients.

Turning now to Fig. 5, various aspects of Fig. 5 may be implemented using a
conventional general purpose or specialized digital computer(s) and/or processor(s)
programmed according to the teachings of the present disclosure, as will be apparent to those
skilled in the computer arts. Appropriate software coding can be prepared readily by skilled
programmers based on the teachings of the present disclosure, as will be apparent to those
skilled in the software arts. The invention also may be implemented by the preparation of
integrated circuits and/or by interconnecting an appropriate network of component circuits, as
will be readily apparent to those skilled in the arts.

Various aspects include a computer program product which is a storage medium
having instructions and/or information stored thereon/in which can be used to program a
general purpose or specialized computing processor(s)/device(s) to perform any of the
features presented herein. The storage medium can include, but is not limited to, one or
more of the following: any type of physical media including floppy disks, optical discs, DVDs,
CD-ROMs, microdrives, magneto-optical disks, holographic storage devices, ROMs, RAMs,
EPROMs, EEPROMs, DRAMs, PRAMS, VRAMs, flash memory devices, magnetic or optical
cards, nano-systems (including molecular memory ICs); paper or paper-based media; and
any type of media or device suitable for storing instructions and/or information. Various
aspects include a computer program product that can be transmitted in whole or in parts and
over one or more public and/or private networks wherein the transmission includes
instructions and/or information which can be used by one or more processors to perform any
of the features presented herein. In various aspects, the transmission may include a plurality
of separate transmissions.

The present disclosure includes software, stored on one or more computer readable media, 
both for controlling the hardware of the general purpose/specialized computer(s) and/or 
processor(s) and for enabling the computer(s) and/or processor(s) to interact with a human 
user or other mechanism utilizing the results of the present invention. 
Such software may include, but is not limited to, device drivers, operating systems, execution 
environments/containers, user interfaces and applications. 

The execution of code can be direct or indirect. The code can include compiled, 
interpreted and other types of languages. Unless otherwise limited by claim language, the 
execution and/or transmission of code and/or code segments for a function can include 
invocations or calls to other software or devices, local or remote, to do the function. The 
invocations or calls can include invocations or calls to library modules, device drivers and 
remote software to do the function. The invocations or calls can include invocations or calls in 
distributed and client/server systems. 

Fig. 6 depicts an aspect of this disclosure having a swab sampling and transport 
system 400 for the minimally invasive sampling of colonic mucosal cells. The system 400 of 
Fig. 6 is comprised of a swab 410 and a container 420. A container 420, such as one 
depicted by the aspect of the disclosure shown in Fig. 6, is configured to stabilize, extract, 
and store the sample of colonic mucosal cells until the diagnostic test for early detection of 
CRC using the disclosed biomarker panel can be done on the sample. 

The swab 410 has a tip 412 extending from the end of a shaft 414. The tip 412 may 
be of a number of shapes such as oblate, square, rectangular, round, etc., and has a 
maximum width of about 0.5 cm to 1.0 cm, and a length of about 1.0 cm to 10.0 cm around 
the end of the shaft. The tip 412 may be composed of a number of materials, such as cotton, 
rayon, polyester, and polymer foam, for example, or combinations of such materials. The 
shaft 414 is made of a material with sufficient mechanical strength for effectively swabbing the 
rectal area, but with enough flexibility to prevent injury. Examples of shaft materials having 
the strength and flexibility properties for a rectal swab include wood, paper, and a variety of 
polymeric materials, such as polyester, polystyrene, and polyurethane, and composites of 
such polymers. 

The container 420 has a body 422 and a cap 424. The body 422 may have a variety 
of lengths and diameters to accommodate a swab 410 having the dimensions of the tip 412 and 
the range of lengths of the shaft 414 described above. The body 422 of the 
container may be made of glass or of a number of polymeric materials, such as polyethylene, 
polypropylene, polycarbonate, or polyfluorocarbon, while the cap 424 typically is made 
of a desirable polymeric material, such as the examples given for the body 422. The 
container 420 has a reagent 426 in the bottom that is suitable for stabilizing and extracting the 
colonic mucosal cells collected on the swab 410 when swabbing of the rectal area is done as 
a minimally invasive sampling technique. Additionally, a container 420 having a reagent 426 
suitable for stabilizing and extracting a sample of colonic mucosal cells from a stool sample 
may be used without the need for the swab 410. 

The reagent 426 contains a buffered solution of guanidine thiocyanate in a 
concentration of at least about 0.4 M and other tissue denaturing reagents, such as a biological 
surfactant in a concentration of between about 0.1% and 10%. Desirable biological surfactants 
can be zwitterionic, such as CHAPS or CHAPSO, non-ionic, such as TWEEN or any of the 
alkylglucoside surfactants, or ionic, such as SDS. A variety of buffers, for example, those 
generally known as Good's buffers, such as Tris, may be used. The concentration of the 
buffer may vary in order to buffer the reagent 426 effectively to a pH of between about 7.0 and 
8.5. 
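The concentration ranges given for the reagent 426 lend themselves to a short worked calculation. The following is a minimal sketch only: the function names are hypothetical, the molar mass of guanidine thiocyanate (118.16 g/mol) is a standard reference value rather than a figure from this disclosure, and the disclosure does not state whether the surfactant percentage is meant as weight/volume or volume/volume.

```python
# Molar mass of guanidine thiocyanate (CH5N3.HSCN) in g/mol.
# Standard reference value; not stated in the disclosure.
GUANIDINE_THIOCYANATE_G_PER_MOL = 118.16


def grams_for_molarity(molarity_mol_per_l, volume_ml, molar_mass_g_per_mol):
    """Mass of solute (g) needed for a molar concentration in a final volume."""
    return molarity_mol_per_l * (volume_ml / 1000.0) * molar_mass_g_per_mol


def surfactant_amount(percent, volume_ml):
    """Amount of surfactant for a percent concentration in a final volume.

    Reads as grams for a % w/v recipe or millilitres for a % v/v recipe;
    which basis applies is an assumption left to the user.
    """
    return percent / 100.0 * volume_ml


def within_disclosed_ranges(molarity, surfactant_percent, ph):
    """True when a formulation lies inside the ranges stated in the text:
    at least about 0.4 M, about 0.1% to 10% surfactant, pH about 7.0 to 8.5."""
    return (
        molarity >= 0.4
        and 0.1 <= surfactant_percent <= 10.0
        and 7.0 <= ph <= 8.5
    )
```

For example, `grams_for_molarity(0.4, 100, GUANIDINE_THIOCYANATE_G_PER_MOL)` gives the mass of guanidine thiocyanate for 100 mL of reagent at the lower concentration limit. Because the text qualifies each limit with "about", the hard cut-offs in `within_disclosed_ranges` are a simplification.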

It is further contemplated that the sample taken using the swab sampling and transport 
system 400 of Fig. 6 can be processed and the data analyzed 
in a single apparatus using the computer hardware and software disclosed above. That is, 
the sample obtained from the aspect of the disclosure of Fig. 6 can be analyzed according to 
Fig. 5 in a single apparatus. However, it is also contemplated that a patient's blood or stool 
sample can be analyzed in the single apparatus. In one embodiment, one aspect of the 
apparatus is a first component that is used to carry out RT-PCR on a sample from a patient 
for gene expression profiling, as described above. Gene expression profiling allows 
quantification of the cDNA of SEQ. ID NOs 1-16, which is reverse-transcribed from mRNA made by 
cells in the sample from the patient. The sets of primers from SEQ. ID NOs 33-64 are used in 
the RT-PCR reaction to prime strands of mRNA corresponding to SEQ. ID NOs 1-16, and 
thereby to synthesize cDNA corresponding to SEQ. ID NOs 1-16. 

After the cDNAs are obtained from the RT-PCR, the data are compared by a second 
component of the apparatus to control data already stored in the apparatus on a storage 
medium. Multivariate analysis as disclosed above is applied using software to execute 
instructions for the ANOVA, M-Dist, or other means of multivariate analysis. Based on the 
statistical analysis, a qualified diagnostician can assess the presence or absence of CRC, the 
progress of CRC, and/or the effects of treatment of CRC. 
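The comparison of a patient's expression profile against stored control data by Mahalanobis distance (M-Dist) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the software of the disclosure: the function names are hypothetical, the control data are assumed to be a list of per-subject expression vectors with one value per marker, and the covariance is the ordinary sample covariance of those controls. Only the Python standard library is used.

```python
from statistics import mean


def mean_and_covariance(rows):
    """Column means and sample covariance matrix of equal-length rows."""
    n, p = len(rows), len(rows[0])
    mu = [mean(r[j] for r in rows) for j in range(p)]
    cov = [
        [sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]
    return mu, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis_distance(sample, controls):
    """Distance of one expression vector from the control distribution."""
    mu, cov = mean_and_covariance(controls)
    diff = [s - m for s, m in zip(sample, mu)]
    weighted = solve(cov, diff)  # cov^-1 * diff without forming the inverse
    return sum(d * w for d, w in zip(diff, weighted)) ** 0.5
```

A larger distance indicates a profile further from the control population. The number of control subjects must exceed the number of markers for the covariance matrix to be invertible, and any decision threshold would have to be established from validated control data.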

In a further aspect of this disclosure, protein expression profiling of patient samples 
can be carried out for early detection of CRC, using a single apparatus. The term 
"polypeptide" or "polypeptides" is used interchangeably herein with the term "protein" or 
"proteins." As discussed previously, proteins long have been investigated for their potential 
as biomarkers, with limited success. There is value in protein biomarkers as complementary 
to polynucleotide biomarkers. Reasons for having the information provided by both types of 
biomarkers include the current observations that mRNA expression levels are not good 
predictors of protein expression levels, and that mRNA expression levels tell nothing of the 
post-translational modifications of proteins that are key to their biological activity. Therefore, 
in order to understand the expression levels of proteins, and their complete structure, the 
direct analysis of proteins is desirable. 

Disclosed herein are proteins listed in SEQ. ID NOs 17-32, which correspond to the 
genes indicated in SEQ. ID NOs 1-16. A further aspect of the disclosed invention is to 
determine expression levels of the proteins indicated by SEQ. ID NOs 17-32. A sample from 
the patient, taken by non- or minimally-invasive methods as disclosed above, can be used to 
prepare fixed cells or a protein extract of cells from the sample. The cells for protein 
expression profiling can be obtained either through the method of Fig. 6, or alternatively for 
example by a blood sample or stool sample, or other non-invasive or minimally invasive 
method (or of course by more conventional invasive methods, including for example 
sigmoidoscopy and other procedures). 

In a first component of the apparatus, the cells or protein extract can be assayed with 
a panel of antibodies - either monoclonal or polyclonal - against the claimed panel of 
biomarkers for measuring targeted polypeptide levels. The objective of the assay is to detect 
and quantify expression of proteins corresponding to the biomarker gene sequences in SEQ. 
ID NOs 1-16, i.e., SEQ. ID NOs 17-32. 

In one aspect of the disclosure contemplated for the method, the antibodies in the 
antibody panel, which are based on the panel of biomarkers, can be bound to a solid support. 
The method for protein expression profiling may use a second antibody having specificity to 
some portion of the bound, targeted polypeptide. Such second antibody may be labeled with 
molecules useful for detecting and quantifying the bound polypeptides, so that, in 
binding to the polypeptide, it labels the polypeptide for detection and quantification. Additionally, other 
reagents are contemplated for labeling the bound polypeptides for detection and 
quantification. Such reagents may either directly label the bound polypeptide or, analogous to 
a second antibody, may be a labeled moiety with specificity for the bound polypeptide. 
Examples of such moieties include but are not limited to small molecules such as cofactors, 
substrates, complexing agents, and the like, or large molecules such as lectins, peptides, 
oligonucleotides, and the like. Such moieties may be either naturally occurring or synthetic. 

Examples of detection modes contemplated for the disclosed methods include, but are 
not limited to spectroscopic techniques, such as fluorescence and UV-Vis spectroscopy, 
scintillation counting, and mass spectroscopy. Complementary to these modes of detection, 
examples of labels for the purpose of detection and quantitation used in these methods 
include, but are not limited to chromophoric labels, scintillation labels, and mass labels. The 
expression levels of polynucleotides and polypeptides measured in a second component of 
the apparatus using these methods may be normalized to a control established for the 



WO 2006/039405 



PCT/US2005/035027 




purpose of the targeted determination. The control data is stored in a computer which is a 
third component of the apparatus. 
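Normalizing measured expression levels to stored control data can be illustrated with a short sketch. The disclosure does not specify a normalization formula, so the approach below (dividing each marker's level by the mean of its stored control values and reporting the log2 ratio) is an assumption chosen for illustration; the function name and the marker keys are hypothetical.

```python
from math import log2
from statistics import mean


def normalize_to_control(sample_levels, control_levels):
    """Log2 ratio of each marker's level to the mean of its control values.

    sample_levels:  dict mapping marker name -> measured level (positive)
    control_levels: dict mapping marker name -> list of control levels
    """
    normalized = {}
    for marker, level in sample_levels.items():
        baseline = mean(control_levels[marker])
        # 0.0 means "same as control"; +1.0 means twice the control mean.
        normalized[marker] = log2(level / baseline)
    return normalized
```

A log ratio treats a doubling and a halving symmetrically (+1 and -1), which is convenient when the normalized values are passed on to a statistical comparison. All levels must be positive for the logarithm to be defined.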

A fourth software component compares the data obtained from a patient's or a 
plurality of patients' samples to the control data. The comparison will comprise at least one 
multivariate analysis, and can include ANOVA, MANOVA, M-Dist, and others known to those 
of ordinary skill in the art. Once the statistical analysis and comparison are 
complete, a physician or other qualified person can make a diagnosis concerning the patient's 
or patients' CRC status. 

Turning now to the drug screening aspect of the present disclosure, it is noted that the 
members of the panel of biomarkers disclosed herein are genes and expression products thereof that also are 
known to be involved in the following metabolic pathways and processes: 1) oxidative 
stress/inflammation; 2) APC/β-catenin pathway; 3) cell cycle/transcription factors; and 4) 
actions of cytokines and other factors involved in cell/cell communications, growth, repair and 
response to injury or trauma. There is increasing evidence that these pathways, and hence 
members of the subject panel of biomarkers, are also involved in many kinds of cancers other 
than CRC, such as lung, prostate and breast, as well as neurodegenerative diseases, such as 
Alzheimer's and amyotrophic lateral sclerosis ("ALS"). In such pathologies, genes and 
expression products thereof involved in these pathways are fundamental to the growth, 
maintenance and response to stress of cells of many different types. During a pathology such 
as cancer or neurodegeneration, altered expression of certain genes results in a 
pathological symptom or symptoms, so that a shift in those genes, and expression products 
thereof, is a characteristic biomarker of that particular pathology. In that regard, seemingly 
unrelated pathologies, such as various cancers and neurodegenerative diseases, are 
manifestations of very complex pathologies that each involve discrete members of the subject 
biomarkers, which are genes and expression products thereof drawn from the above group of 
pathways and processes. As practical evidence of this, it is now appreciated that COX-2 
inhibitors have therapeutic value for a wide variety of disorders, including not only colon and 
other cancers, but some neurodegenerative diseases as well. 

What is disclosed herein is the use of the subject biomarker panel in Fig. 1 in the drug 
discovery process for pathologies such as cancers, for example CRC, lung, prostate, and 
breast, and neurodegenerative diseases, for example Alzheimer's and ALS. As mentioned 
above, the discrete pattern of altered genes and expression products thereof provides a 
unique signature for each specific disease, so the panel provides the necessary selectivity for 
a variety of pathologies. What is meant by drug is any therapeutic agent that is useful in the 
treatment of a pathology. This includes traditional synthetic molecules, natural products, 
natural products that are synthetically modified, and biopharmaceutical products, such as 
polypeptides and polynucleotides, and combinations, extracts and preparations thereof. 

Drug screening is part of the first stage of drug development, referred to as the drug 
discovery phase. Prospective drugs that are qualified through the drug screening process are 
typically referred to as leads, which is to say that in passing the criteria of the screening 
process they are advanced to further testing in a stage of drug discovery generally referred to 
as lead optimization. If they pass the lead optimization stage of drug discovery, the leads are 
qualified as candidates, are advanced beyond the drug discovery stage to the next stage 
of drug development, known as preclinical trials, and are referred to as investigational new 
drugs ("IND"). If the IND is advanced, it is advanced to clinical trials, where it is tested in 
human subjects. Finally, if the IND shows promise through the clinical trial stage, after 
approval from FDA, it may be commercialized. The entire drug development process for a 
single candidate is known to take 10-15 years and to cost hundreds of millions of dollars in 
development costs. For that reason, the current strategy within the pharmaceutical drug 
development community is to focus on the drug discovery stage as the effective point for weeding out 
prospective drugs efficiently, and to advance only candidates with high potential for success 
through the remaining drug development cycle. 

In the screening stage of drug discovery, a specific assay for evaluating prospective 
drugs is performed against a qualified biological model system for which a specific endpoint is 
monitored. A biomarker panel that is used as a surrogate endpoint for drug screening for 
pathologies, such as cancers, for example CRC, lung, prostate, and breast, and 
neurodegenerative diseases, for example Alzheimer's and ALS, is not only a panel useful for 
early detection of such pathologies, but additionally demonstrates modulation by a drug in a 
fashion that correlates with a decrease in the pathology occurrence or recurrence. 
Additionally, one or more members of a biomarker panel useful in the early detection of such 
pathologies may also be useful as targets for drug screening for such pathologies. As will be 
discussed subsequently, the biomarkers described by Fig. 1 may be useful both as surrogate 
endpoints in model biological systems, as well as targets in drug screening. 

During the screening phase, large libraries of prospective drugs may be evaluated, 
representing a throughput of tens of thousands of compounds over a single screening 
regimen. What is regarded as low-throughput screening ("LTS") is about 10,000 to about 
50,000 prospective drugs, while medium-throughput screening ("MTS") represents about 
50,000 to about 100,000 prospective drugs, and high-throughput screening ("HTS") is about 100,000 
to about 500,000 prospective drugs. 
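The throughput categories above can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, the MTS range is read as about 50,000 to about 100,000 prospective drugs, and because the stated ranges share their end points, assigning a boundary value to the higher category is an arbitrary choice made here.

```python
def screening_scale(library_size):
    """Classify a screening regimen by the number of prospective drugs.

    Returns "LTS", "MTS", or "HTS", or None when the library size falls
    outside the ranges given in the text.
    """
    if library_size < 10_000:
        return None  # below the low-throughput range
    if library_size < 50_000:
        return "LTS"
    if library_size < 100_000:
        return "MTS"
    if library_size <= 500_000:
        return "HTS"
    return None  # above the high-throughput range
```

Since the text qualifies every limit with "about", the hard cut-offs should be taken as approximate.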

What is meant by screening regimen includes both the testing protocol and analytical 
methodology by which the screening is conducted. The screening regimen, then, includes 
factors such as the type of biological model that will be used in the test; the conditions under 
which the testing will be conducted; the type of prospective drug candidates, or library of 
prospective candidates that will be used; the type of equipment that will be used; and the 
manner in which the data are collected, processed, and stored. The scale of the screening 
regimen (LTS, MTS, or HTS) is impacted by factors such as testing protocol (e.g., type of 
assay), analytical methodology (e.g., miniaturization, automation), and computational 
capability and capacity. What is meant by biological model system includes whole organism, 
whole cell, cell lysate, and molecular target. What is meant by prospective drug candidate is 
any type of molecule, or preparation or suspension of molecules, under consideration for 
having therapeutic use. For example, the prospective drug candidates could be synthetic 
molecules, natural products, natural products that are synthetically modified, and 
biopharmaceutical products, such as polypeptides and polynucleotides, and combinations, 
extracts, and preparations thereof. 

As discussed above, Fig. 1 provides sequence listings of a panel of biomarkers useful 
in practicing the disclosed invention. One aspect of the disclosure is a biomarker panel of 16 
identified coding sequences given in SEQ. ID NOs 1-16, while another aspect of a biomarker 
panel is the 16 identified proteins given by SEQ. ID NOs 17-32. These two aspects of the 
present invention provide the selectivity and sensitivity necessary for the early detection of 
pathologies, such as cancers, for example CRC, lung, prostate, and breast, and 
neurodegenerative diseases, for example Alzheimer's and ALS. 

As previously mentioned, CRC is an exemplary pathology contemplated for 
development of novel drugs. For CRC, no biomarker or biomarker panel has been identified 
that has an acceptably high degree of selectivity and sensitivity to be effective for early 
detection of CRC. Therefore, what is described in Fig. 1 are aspects of biomarker panels that 
are differentiating in providing the basis for early detection of CRC. Selectivity of a biomarker, 
defined clinically, refers to the percentage of patients correctly diagnosed. Sensitivity of a 
biomarker in a clinical context is defined as the probability that the disease is detected at a 
curable stage. Ideally, biomarkers would have 100% clinical selectivity and 100% clinical 
sensitivity. To date, no biomarker or biomarker panel has been identified that has the 
acceptably high degree of selectivity and sensitivity required to be effective for the broad 
range of needs in patient care management. 

The analytical methodology by which the screening is conducted may include the 
methodologies disclosed above for early detection of CRC, i.e. gene expression profiling from 
the mRNA of a biological sample to determine the gene expression of biomarkers and how 
their expression level(s) might have been affected by a prospective drug candidate (including 
use of RT-PCR), and/or determining protein expression levels of the Fig. 1 polypeptide 
biomarkers due to application of a prospective drug candidate; and then applying multivariate 
statistical analysis to determine the statistical significance of the expression levels of the 
various markers in the panel, with and without the prospective drug candidate(s). 
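Comparing expression levels with and without a prospective drug candidate can be sketched, for a single marker, as a one-way ANOVA F statistic. This is a simplified, univariate illustration under stated assumptions and not the multivariate procedure of the disclosure: the function name is hypothetical, each group is a list of replicate expression measurements for one marker, and the conversion of F to a p-value (which requires the F distribution) is left to a statistics package.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """F statistic for one-way ANOVA across two or more groups of replicates."""
    k = len(groups)                       # number of groups (e.g., treated, untreated)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of replicates around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within
```

Called as `one_way_anova_f(untreated, treated)`, a large F suggests that the marker's mean expression differs between the two conditions. The within-group variance must be non-zero, and a panel-wide assessment would still call for a multivariate method such as MANOVA or Mahalanobis distance.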

Referring to Fig. 7, one aspect of the drug screening disclosure contemplates 
obtaining a tissue sample, such as a swab (see Fig. 6), blood sample, or biopsy, which can 
be taken by, for example, minimally invasive, invasive, or non-invasive means. An 
appropriate lysis buffer can be used to extract and preserve the RNA of the cells in the tissue 
sample. RT-PCR then can be carried out on the extracted RNA, which is converted to cDNA as 
disclosed above, using, for example, at least two of the primers listed in SEQ. ID NOs 33-64, 
specific to the biomarker panel of Fig. 1, to screen the effect of the drug. The results of the 
assay can then be subjected to a multivariate analysis and M-dist, as disclosed above, and 
the results compared to control data. 

Figure 8 depicts a further aspect of the drug screening disclosure in which antibodies 
are made against at least two biomarker proteins listed as SEQ. ID NOs 17-32, and the 
antibodies are used to assay a biological system, for example whole cells, cell lysates, etc. 
from, for example, biopsies or other tissue samples as set forth above. The antibodies are 
used to detect and quantify expression of the biomarker peptides identified by SEQ. ID NOs 
17-32, so that the expression of these biomarker peptides can be monitored as a function of 
dosing the biological system with a potential drug. The results can be subjected to 
multivariate or univariate analysis and M-dist, as disclosed above, and compared to control 
data. 

What has been disclosed herein has been provided for the purposes of illustration and 
description. It is not intended to be exhaustive or to limit what is disclosed to the precise 
forms described. Many modifications and variations will be apparent to the practitioner skilled 
in the art. What is disclosed was chosen and described in order to best explain the principles 
and practical application of the disclosed embodiments of the art described, thereby enabling 
others skilled in the art to understand the various embodiments and various modifications that 
are suited to the particular use contemplated. 

The references cited above are incorporated by reference in full. 
CLAIMS 

What is claimed: 

1. A method for making a reagent composition for the early detection of colorectal 
cancer, lung cancer, prostate cancer, breast cancer, Alzheimer's and ALS, the 
method comprising: 

synthesizing a pair of primers for each polynucleotide pair from SEQ. ID NOs 33-64; 

adjusting to at least one desired concentration in a plurality of separate stock solutions 
each of said primers, using a diluent; 

aliquoting each of said stock solutions of each of said primers into a plurality of 
containers; and 

storing the plurality of containers in long-term storage conditions. 

2. The method of claim 1 wherein the method further comprises lyophilizing the aliquoted 
stock solutions of each of said primer pairs. 

3. A method for early detection of colorectal cancer, lung cancer, prostate cancer, breast 
cancer, Alzheimer's and ALS, the method comprising: 

obtaining a tissue sample by a non-invasive or a minimally invasive method from 
grossly-normal appearing tissue; 

isolating RNA from the sample; 

amplifying copies of cDNA from the RNA sample using a plurality of pairs of primers 
selected from the group consisting of SEQ. ID NOs 33-64, to detect a panel of 
polynucleotides selected from SEQ. ID NOs 1-16; 

quantifying the amplified copies of cDNA; and 

using the quantified amplified copies of cDNA to assess at least one of disease 
progress and treatment effectiveness for at least one of colorectal cancer, lung 
cancer, prostate cancer, breast cancer, Alzheimer's and ALS. 

4. The method as in claim 3 wherein the obtaining step further comprises sampling rectal 
mucosal cells. 

5. The method of claim 3 wherein the obtaining step further comprises one of drawing 
blood, sampling stool, and taking a rectal biopsy. 

6. The method of claim 3 wherein the using step further comprises: 

analyzing by multivariate analysis the quantified levels of tissue sample cDNA; 

comparing the multivariate analysis of the quantified levels of tissue sample cDNA 
with a plurality of control data, wherein the comparison determines a significance of 
differences from the control data to assess the presence of colorectal cancer. 

7. The method of claim 6 wherein the analyzing step further comprises using one of an 
ANOVA test and a Mahalanobis distance test. 

8. A method for early detection of colorectal cancer and for evaluation of treatment 
efficacy of colorectal cancer, the method comprising the steps of: 

obtaining by a non-invasive or minimally-invasive method a tissue sample containing 
cells that grossly appear cancer-free; 

generating a plurality of antibodies having different specificities against each of the 
polypeptides identified by SEQ. ID NOs 17-32; 

assaying for expression of polypeptides in a panel of polypeptides identified by SEQ. 
ID NOs 17-32 with the plurality of antibodies, wherein the assaying step allows for 
quantifying specific binding of the antibodies to the polypeptides; 

quantifying the levels of each of the different polypeptides in the panel of polypeptides 
based on the quantified specific antibody binding; and 

analyzing the quantified levels of each of the different polypeptides in the panel of 
polypeptides, wherein the quantified levels are used to assess at least one of the 
presence, progress, and treatment of colorectal cancer. 

9. The method of claim 8 wherein the obtaining step further comprises one of sampling 
blood, sampling stool, swabbing for colonic cells, and performing a rectal biopsy. 

10. A method for analyzing data for the early detection and treatment monitoring of 
colorectal cancer, the method comprising the following steps: 

obtaining a plurality of quantified levels of cDNA for polynucleotides selected from 
SEQ. ID NOs 1-16 from a patient sample, wherein the sample is taken by a 
non-invasive method or a minimally-invasive method; 

comparing said data from the patient sample to a plurality of stored control data using 
multivariate statistical analysis; and 

making a determination concerning one of diagnosis of colorectal cancer, colorectal 
cancer progress, and treatment efficacy for the patient based on the comparison. 

11. A machine readable medium having instructions stored thereon that, when executed 
by one or more processors, cause a system to: 

obtain the data of quantified levels of cDNA for polynucleotides listed in SEQ. ID NOs 
1-16, wherein the quantified levels of cDNA are from a patient tissue sample and a 
control tissue sample; 

compare the quantified levels of cDNA from the patient tissue sample to the quantified 
levels of cDNA from the control tissue sample using at least one multivariate statistical 
analysis; and 

provide said multivariate statistical analysis for evaluation by an individual trained to 
evaluate colorectal cancer. 

12. A computer signal embodied in a transmission medium, comprising: 

a code segment including instruction for obtaining quantified levels of cDNA for 
polynucleotides selected from SEQ. ID NOs 1-16, wherein the quantified levels of 
cDNA are from a patient tissue sample; 

a code segment including instruction for comparing the quantified levels of cDNA from 
the patient tissue sample to a plurality of control data using multivariate statistical 
analysis; and 

a code segment including instruction for making a diagnosis of colorectal cancer for 
the patient tissue sample based on the comparison. 

13. A computer signal embodied in a transmission medium, comprising: 

a code segment including instruction for obtaining quantified levels of polypeptides 
selected from SEQ. ID NOs 17-32, wherein the quantified levels of polypeptides are 
from a patient sample containing colonic mucosal cells; 

a code segment including instruction for comparing the quantified levels of 
polypeptides from the patient sample to a plurality of control data using multivariate 
statistical analysis; and 

a code segment including at least one instruction based on the comparison for at least 
one of a diagnosis of colorectal cancer, a progress of colorectal cancer, and an 
efficacy of treatment of colorectal cancer. 

14. A kit for use in the early detection of colorectal cancer, the kit comprising: 

a collection container for receiving a sample containing rectal mucosal cells obtained 
through a non-invasive procedure, wherein the collection container is configured to 
stabilize and store the sample; and 

at least one reagent that is used in the analysis of polynucleotide expression levels, 
wherein the polynucleotides are selected from SEQ. ID NOs 1-16. 
15. A kit for use in the detection of colorectal cancer, the kit comprising: 

a swab sampling and sample transport system for the minimally invasive sampling of 
rectal mucosal cells, which system is comprised of: 

a swab configured to sample colonic mucosal cells from the rectum; and 

a collection container for receiving the swab after the sample has been taken, 
wherein the collection container is configured to stabilize, extract and store the 
sample; and 

at least one reagent that is used in the analysis of polynucleotide expression levels, 
wherein the polynucleotides are selected from SEQ. ID Nos. 1-16. 

16. A method for drug screening, the method comprising the following steps: 

selecting a model biological system for at least one of colorectal cancer, lung cancer, 
prostate cancer, breast cancer, Alzheimer's and ALS; 

selecting at least one prospective drug for screening using the selected model 
biological system; 

selecting at least two biomarkers from a panel of biomarkers identified by SEQ. ID NOs 
1-32; 

dosing the model biological system with the at least one prospective drug; and 

monitoring the response of the at least two biomarkers in the model biological system 
as a function of the dosing step. 

17. The method of claim 16, further comprising: determining the efficacy of the 
prospective drug based on the monitoring step. 
gtgttgacat ccagatcaca  420
tttgattgac agtccaccaa cttacaatgc tgactatggc tacaaaagct gggaagcctt  480
ctctaacctc tcctattata ctagagccct tcctcctgtg cctgatgatt gcccgactcc  540
cttgggtgtc aaaggtaaaa agcagcttcc tgattcaaat gagattgtgg aaaaattgct  600
tctaagaaga aagttcatcc ctgatcccca gggctcaaac atgatgtttg cattctttgc  660
ccagcacttc acgcatcagt ttttcaagac agatcataag cgagggccag ctttcaccaa  720
cgggctgggc catggggtgg acttaaatca tatttacggt gaaactctgg ctagacagcg  780
taaactgcgc cttttcaagg atggaaaaat gaaatatcag ataattgatg gagagatgta  840
tcctcccaca gtcaaagata ctcaggcaga gatgatctac cctcctcaag tccctgagca  900
tctacggttt gctgtggggc aggaggtctt tggtctggtg cctggtctga tgatgtatgc  960
cacaatctgg ctgcgggaac acaacagagt atgcgatgtg cttaaacagg agcatcctga  1020
atggggtgat gagcagttgt tccagacaag caggctaata ctgataggag agactattaa  1080
gattgtgatt gaagattatg tgcaacactt gagtggctat cacttcaaac tgaaatttga  1140
cccagaacta cttttcaaca aacaattcca gtaccaaaat cgtattgctg ctgaatttaa  1200
caccctctat cactggcatc cccttctgcc tgacaccttt caaattcatg accagaaata  1260
caactatcaa cagtttatct acaacaactc tatattgctg gaacatggaa ttacccagtt  1320
tgttgaatca ttcaccaggc aaattgctgg cagggttgct ggtggtagga atgttccacc  1380
cgcagtacag aaagtatcac aggcttccat tgaccagagc aggcagatga aataccagtc  1440
ttttaatgag taccgcaaac gctttatgct gaagccctat gaatcatttg aagaacttac  1500
aggagaaaag gaaatgtctg cagagttgga agcactctat ggtgacatcg atgctgtgga  1560
gctgtatcct gcccttctgg tagaaaagcc tcggccagat gccatctttg gtgaaaccat  1620
ggtagaagtt ggagcaccat tctccttgaa aggacttatg ggtaatgtta tatgttctcc  1680
tgcctactgg aagccaagca cttttggtgg agaagtgggt tttcaaatca tcaacactgc  1740
ctcaattcag tctctcatct gcaataacgt gaagggctgt ccctttactt cattcagtgt  1800
tccagatcca gagctcatta aaacagtcac catcaatgca agttcttccc gctccggact  1860
agatgatatc aatcccacag tactactaaa agaacgttcg actgaactgt agaagtctaa  1920
tgatcatatt tatttattta tatgaaccat gtctattaat ttaattattt aataatattt  1980
atattaaact ccttatgtta cttaacatct tctgtaacag aagtcagtac tcctgttgcg  2040
gagaaaggag tcatacttgt gaagactttt atgtcactac tctaaagatt ttgctgttgc  2100
tgttaagttt ggaaaacagt ttttattctg ttttataaac cagagagaaa tgagttttga  2160
cgtcttttta cttgaatttc aacttatatt ataagaacga aagtaaagat gtttgaatac  2220
ttaaacactg tcacaagatg gcaaaatgct gaaagttttt acactgtcga tgtttccaat  2280
gcatcttcca tgatgcatta gaagtaacta atgtttgaaa ttttaaagta cttttggtta  2340
tttttctgtc atcaaacaaa aacaggtatc agtgcattat taaatgaata tttaaattag  2400
acattaccag taatttcatg tctacttttt aaaatcagca atgaaacaat aatttgaaat  2460
ttctaaattc atagggtaga atcacctgta aaagcttgtt tgatttctta aagttattaa  2520
acttgtacat ataccaaaaa gaagctgtct tggatttaaa tctgtaaaat cagtagaaat  2580


tttactacaa 


ttgcttgtta 


aaatatttta 


taagtgatgt 


tcctttttca 


ccaagagtat 


2640 


aaaccttttt 


agtgtgactg 


ttaaaacttc 


cttttaaatc 


aaaatgccaa 


atttattaag 


2700 


gtggtggagc 


cactgcagtg 


ttatcttaaa 


ataagaatat 


tttgttgaga 


tattccagaa 


2760 


tttgtttata 


tggctggtaa 


catgtaaaat 


ctatatcagc 


aaaagggtct 


acctttaaaa 


2820 


taagcaataa 


caaagaagaa 


aaccaaatta 


ttgttcaaat 


ttaggtttaa 


acttttgaag 


2880 
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caaacttttt 


tttatccttg 


tgcactgcag 


gcctggtact 


cagattttgc 


tatgaggtta 


2940 


atgaagtacc 


aagctgtgct 


tgaataatga 


tatgttttct 


cagattttct 


gttgtacagt 


3000 


ttaatttagc 


agtccatatc 


acattgcaaa 


agtagcaatg 


acctcataaa 


atacctcttc 


3060 


aaaatgctta 


aattcatttc 


acacattaat 


tttatctcag 


tcttgaagcc 


aattcagtag 


3120 


gtgcattgga atcaagcctg 


gctacctgca 


tgctgttcct 


tttcttttct 


tcttttagcc 


3180 


attttgctaa gagacacagt 


cttctcatca 


cttcgtttct 


cctattttgt 


tttactagtt 


3240 


ttaagatcag 


agttcacttt 


ctttggactc 


tgcctatatt 


ttcttacctg 


aacttttgca 


3300 


agttttcagg 


taaacctcag 


ctcaggactg 


ctatttagct 


cctcttaaga 


agatta 


3356 


<210> 3 
<211> 1750 
<212> DNA 
<213> HUMAN 












<400> 3 
cctacaggtg aaaagcccag cgacccagtc aggatttaag tttacctcaa aaatggaaga 60

ttttaacatg gagagtgaca gctttgaaga tttctggaaa ggtgaagatc ttagtaatta 120

cagttacagc tctaccctgc ccccttttct actagatgcc gccccatgtg aaccagaatc 180

cctggaaatc aacaagtatt ttgtggtcat tatctatgcc ctggtattcc tgctgagcct 240

gctgggaaac tccctcgtga tgctggtcat cttatacagc agggtcggcc gctccgtcac 300

tgatgtctac ctgctgaacc tagccttggc cgacctactc tttgccctga ccttgcccat 360

ctgggccgcc tccaaggtga atggctggat ttttggcaca ttcctgtgca aggtggtctc 420

actcctgaag gaagtcaact tctatagtgg catcctgcta ctggcctgca tcagtgtgga 480

ccgttacctg gccattgtcc atgccacacg cacactgacc cagaagcgct acttggtcaa 540

attcatatgt ctcagcatct ggggtctgtc cttgctcctg gccctgcctg tcttactttt 600

ccgaaggacc gtctactcat ccaatgttag cccagcctgc tatgaggaca tgggcaacaa 660

tacagcaaac tggcggatgc tgttacggat cctgccccag tcctttggct tcatcgtgcc 720

actgctgatc atgctgttct gctacggatt caccctgcgt acgctgttta aggcccacat 780

ggggcagaag caccgggcca tgcgggtcat ctttgctgtc gtcctcatct tcctgctttg 840

ctggctgccc tacaacctgg tcctgctggc agacaccctc atgaggaccc aggtgatcca 900

ggagacctgt gagcgccgca atcacatcga ccgggctctg gatgccaccg agattctggg 960

catccttcac agctgcctca accccctcat ctacgccttc attggccaga agtttcgcca 1020

tggactcctc aagattctag ctatacatgg cttgatcagc aaggactccc tgcccaaaga 1080

cagcaggcct tcctttgttg gctcttcttc agggcacact tccactactc tctaagacct 1140

cctgcctaag tgcagccccg tggggttcct cccttctctt cacagtcaca ttccaagcct 1200

catgtccact ggttcttctt ggtctcagtg tcaatgcagc ccccattgtg gtcacaggaa 1260

gcagaggagg ccacgttctt actagtttcc cttgcatggt ttagaaagct tgccctggtg 1320

cctcacccct tgccataatt actatgtcat ttgctggagc tctgcccatc ctgcccctga 1380

gcccatggca ctctatgttc taagaagtga aaatctacac tccagtgaga cagctctgca 1440

tactcattag gatggctagt atcaaaagaa agaaaatcag gctggccaac gggatgaaac 1500

cctgtctcta ctaaaaatac aaaaaaaaaa aaaaaaatta gccgggcgtg gtggtgagtg 1560

cctgtaatca cagctacttg ggaggctgag atgggagaat cacttgaacc cgggaggcag 1620

aggttgcagt gagccgagat tgtgcccctg cactccagcc tgagcgacag tgagactctg 1680

tctcagtcca tgaagatgta gaggagaaac tggaactctc gagcgttgct gggggggatt 1740

gtaaaatggt 1750


<210> 4 
<211> 3939 
<212> DNA 
<213> HUMAN 












<400> 4 
cctgggtcct ctcggcgcca gagccgctct ccgcatccca ggacagcggt gcggccctcg 60

gccggggcgc ccactccgca gcagccagcg agccagctgc cccgtatgac cgcgccgggc 120

gccgccgggc gctgccctcc cacgacatgg ctgggctccc tgctgttgtt ggtctgtctc 180

ctggcgagca ggagtatcac cgaggaggtg tcggagtact gtagccacat gattgggagt 240

ggacacctgc agtctctgca gcggctgatt gacagtcaga tggagacctc gtgccaaatt 300

acatttgagt ttgtagacca ggaacagttg aaagatccag tgtgctacct taagaaggca 360

tttctcctgg tacaagacat aatggaggac accatgcgct tcagagataa caccgccaat 420

cccatcgcca ttgtgcagct gcaggaactc tctttgaggc tgaagagctg cttcaccaag 480

gattatgaag agcatgacaa ggcctgcgtc cgaactttct atgagacacc tctccagttg 540

ctggagaagg tcaagaatgt ctttaatgaa acaaagaatc tccttgacaa ggactggaat 600

attttcagca agaactgcaa caacagcttt gctgaatgct ccagccaaga tgtggtgacc 660

aagcctgatt gcaactgcct gtaccccaaa gccatcccta gcagtgaccc ggcctctgtc 720

tcccctcatc agcccctcgc cccctccatg gcccctgtgg ctggcttgac ctgggaggac 780

tctgagggaa ctgagggcag ctccctcttg cctggtgagc agcccctgca cacagtggat 840

ccaggcagtg ccaagcagcg gccacccagg agcacctgcc agagctttga gccgccagag 900

accccagttg tcaaggacag caccatcggt ggctcaccac agcctcgccc ctctgtcggg 960

gccttcaacc ccgggatgga ggatattctt gactctgcaa tgggcactaa ttgggtccca 1020

gaagaagcct ctggagaggc cagtgagatt cccgtacccc aagggacaga gctttccccc 1080

tccaggccag gagggggcag catgcagaca gagcccgcca gacccagcaa cttcctctca 1140

gcatcttctc cactccctgc atcagcaaag ggccaacagc cggcagatgt aactgctaca 1200

gccttgccca gggtgggccc cgtgatgccc actggccagg actggaatca caccccccag 1260

aagacagacc atccatctgc cctgctcaga gaccccccgg agccaggctc tcccaggatc 1320

tcatcactgc gcccccaggc cctcagcaac ccctccaccc tctctgctca gccacagctt 1380

tccagaagcc actcctcggg cagcgtgctg ccccttgggg agctggaggg caggaggagc 1440

accagggatc ggacgagccc cgcagagcca gaagcagcac cagcaagtga aggggcagcc 1500

aggcccctgc cccgttttaa ctccgttcct ttgactgaca caggccatga gaggcagtcc 1560

gagggatcct ccagcccgca gctccaggag tctgtcttcc acctgctggt gcccagtgtc 1620

atcctggtct tgctggctgt cggaggcctc ttgttctaca ggtggaggcg gcggagccat 1680

caagagcctc agagagcgga ttctcccttg gagcaaccag agggcagccc cctgactcag 1740

gatgacagac aggtggaact gccagtgtag agggaattct aagctggacg cacagaacag 1800

tctcttcgtg ggaggagaca ttatggggcg tccaccacca cccctccctg gccatcctcc 1860

tggaatgtgg tctgccctcc accagagctc ctgcctgcca ggactggacc agagcagcca 1920

ggctggggcc cctctgtctc aacccgcaga cccttgactg aatgagagag gccagaggat 1980

gctccccatg ctgccactat ttattgtgag ccctggaggc tcccatgtgc ttgaggaagg 2040

ctggtgagcc cggctcagga ccctcttccc tcaggggctg cagcctcctc tcactccctt 2100

ccatgccgga acccaggcca gggacccacc ggcctgtggt ttgtgggaaa gcagggtgca 2160

cgctgaggag tgaaacaacc ctgcacccag agggcctgcc tggtgccaag gtatcccagc 2220

ctggacaggc atggacctgt ctccagacag aggagcctga agttcgtggg gcgggacagc 2280

ctcggcctga tttcccgtaa aggtgtgcag cctgagagac gggaagagga ggcctctgca 2340

cctgctggtc tgcactgaca gcctgaaggg tctacaccct cggctcacct aagtccctgt 2400

gctggttgcc aggcccagag gggaggccag ccctgccctc aggacctgcc tgacctgcca 2460

gtgatgccaa gagggggatc aagcactggc ctctgcccct cctccttcca gcacctgcca 2520

gagcttctcc agcaggccaa gcagaggctc ccctcatgaa ggaagccatt gcactgtgaa 2580

cactgtacct gcctgctgaa cagcctcccc ccgtccatcc atgagccagc atccgtccgt 2640

cctccactct ccagcctctc cccagcctcc tgcactgagc tggcctcacc agtcgactga 2700

gggagcccct cagccctgac cttctcctga cctggccttt gactccccgg agtggagtgg 2760

ggtgggagaa cctcctgggc cgccagccag agccgctctt taggctgtgt tcttcgccca 2820

ggtttctgca tcttccactt tgacattccc aagagggaag ggactagtgg gagagagcaa 2880

gggaggggag ggcacagaca gagagcctac agggcgagct ctgactgaag atgggccttt 2940

gaaatatagg tatgcacctg aggttggggg agggtctgca ctcccaaacc ccagcgcagt 3000

gtcctttccc tgctgccgac aggaacctgg ggctgagcag gttatccctg tcaggagccc 3060

tggactgggc tgcatctcag ccccacctgc atggtatcca gctcccatcc acttctcacc 3120

cttctttcct cctgaccttg gtcagcagtg atgacctcca actctcaccc accccctcta 3180

ccatcacctc taaccaggca agccagggtg ggagagcaat caggagagcc aggcctcagc 3240

ttccaatgcc tggagggcct ccactttgtg gccagcctgt ggtgctggct ctgaggccta 3300

ggcaacgagc gacagggctg ccagttgccc ctgggttcct ttgtgctgct gtgtgcctcc 3360

tctcctgccg ccctttgtcc tccgctaaga gaccctgccc tacctggccg ctgggccccg 3420

tgactttccc ttcctgccca ggaaagtgag ggtcggctgg ccccaccttc cctgtcctga 3480

tgccgacagc ttagggaagg gcactgaact tgcatatggg gcttagcctt ctagtcacag 3540

cctctatatt tgatgctaga aaacacatat ttttaaatgg aagaaaaata aaaaggcatt 3600

cccccttcat ccccctacct taaacatata atattttaaa ggtcaaaaaa gcaatccaac 3660

ccactgcaga agctcttttt gagcacttgg tggcatcaga gcaggaggag ccccagagcc 3720

acctctggtg tcccccaggc tacctgctca ggaacccctt ctgttctctg agaactcaac 3780

agaggacatt ggctcacgca ctgtgagatt ttgtttttat acttgcaact ggtgaattat 3840

tttttataaa gtcatttaaa tatctattta aaagatagga agctgcttat atatttaata 3900

ataaaagaag tgcacaagct gccgttgacg tagctcgag 3939



<210> 5 

<211> 1024 

<212> DNA 

<213> HUMAN 

<400> 5 



atggcccgcg ctgctctctc cgccgccccc agcaatcccc ggctcctgcg agtggcactg 60

ctgctcctgc tcctggtagc cgctggccgg cgcgcagcag gagcgtccgt ggccactgaa 120

ctgcgctgcc agtgcttgca gaccctgcag ggaattcacc ccaagaacat ccaaagtgtg 180

aacgtgaagt cccccggacc ccactgcgcc caaaccgaag tcatagccac actcaagaat 240

gggcggaaag cttgcctcaa tcctgcatcc cccatagtta agaaaatcat cgaaaagatg 300

ctgaacagtg acaaatccaa ctgaccagaa gggaggagga agctcactgg tggctgttcc 360

tgaaggaggc cctgccctta taggaacaga agaggaaaga gagacacagc tgcagaggcc 420

acctggattg tgcctaatgt gtttgagcat cgcttaggag aagtcttcta tttatttatt 480

tattcattag ttttgaagat tctatgttaa tattttaggt gtaaaataat taagggtatg 540

attaactcta cctgcacact gtcctattat attcattctt tttgaaatgt caaccccaag 600

ttagttcaat ctggattcat atttaatttg aaggtagaat gttttcaaat gttctccagt 660

cattatgtta atatttctga ggagcctgca acatgccagc cactgtgata gaggctggcg 720

gatccaagca aatggccaat gagatcattg tgaaggcagg ggaatgtatg tgcacatctg 780

ttttgtaact gtttagatga atgtcagttg ttatttattg aaatgatttc acagtgtgtg 840

gtcaacattt ctcatgttga aactttaaga actaaaatgt tctaaatatc ccttggacat 900

tttatgtctt tcttgtaagg catactgcct tgtttaatgg tagttttaca gtgtttctgg 960

cttagaacaa aggggcttaa ttattgatgt tttcatagag aatataaaaa taaagcactt 1020

atag 1024

<210> 6
<211> 1064
<212> DNA
<213> HUMAN

<220>
<221> misc_feature
<222> (27)..(27)
<223> n = a, c, g, t

<220>
<221> misc_feature
<222> (766)..(766)
<223> n = a, c, g, t

<400> 6

cacagccggg tcgcaggcac ctccccngcc agctctcccg cattctgcac agcttcccga 60

cgcgtctgct gagccccatg gcccacgcca cgctctccgc cgcccccagc aatccccggc 120

tcctgcgggt ggcgctgctg ctcctgctcc tggtgggcag ccggcgcgca gcaggagcgt 180

ccgtggtcac tgaactgcgc tgccagtgct tgcagacact gcagggaatt cacctcaaga 240

acatccaaag tgtgaatgta aggtcccccg gaccccactg cgcccaaacc gaagtcatag 300

ccacactcaa gaatgggaag aaagcttgtc tcaaccccgc atcccccatg gttcagaaaa 360

tcatcgaaaa gatactgaac aaggggagca ccaactgaca ggagagaagt aagaagctta 420

tcagcgtatc attgacactt cctgcagggt ggtccctgcc cttaccagag ctgaaaatga 480

aaaagagaac agcagctttc tagggacagc tggaaaggga cttaatgtgt ttgactattt 540

cttacgaggg ttctacttat ttatgtattt atttttgaaa gcttgtattt taatatttta 600

catgctgtta tttaaagatg tgagtgtgtt tcatcaaaca tagctcagtc ctgattattt 660

aattggaata tgatgggttt taaatgtgtc attaaactaa tatttagtgg gagaccataa 720

tgtgtcagcc accttgataa atgacagggt ggggaactgg agggtngggg gattgaaatg 780

caagcaatta gtggatcact gttagggtaa gggaatgtat gtacacatct attttttata 840

cttttttttt taaaaaagaa tgtcagttgt tatttattca aattatctca cattatgtgt 900

tcaacatttt tatgctgaag tttcccttag acattttatg tcttgcttgt agggcataat 960

gccttgttta atgtccattc tgcagcgttt ctctttccct tggaaaagag aatttatcat 1020

tactgttaca tttgtacaaa tgacatgata ataaaagttt tatg 1064


<210> 7 
<211> 1469 
<212> DNA 
<213> HUMAN 












<400> 7 
agcagcagga ggaggcagag cacagcatcg tcgggaccag actcgtctca ggccagttgc 60

agccttctca gccaaacgcc gaccaaggaa aactcactac catgagaatt gcagtgattt 120

gcttttgcct cctaggcatc acctgtgcca taccagttaa acaggctgat tctggaagtt 180

ctgaggaaaa gcagctttac aacaaatacc cagatgctgt ggccacatgg ctaaaccctg 240

acccatctca gaagcagaat ctcctagccc cacagaccct tccaagtaag tccaacgaaa 300

gccatgacca catggatgat atggatgatg aagatgatga tgaccatgtg gacagccagg 360

actccattga ctcgaacgac tctgatgatg tagatgacac tgatgattct caccagtctg 420

atgagtctca ccattcxgat gaatctgatg aactggtcac tgattttccc acggacctgc 480

cagcaaccga agttttcact ccagttgtcc ccacagtaga cacatatgat ggccgaggtg 540

atagtgtggt ttatggactg aggtcaaaat ctaagaagtt tcgcagacct gacatccagt 600

accctgatgc tacagacgag gacatcacct cacacatgga aagcgaggag ttgaatggtg 660

catacaaggc catccccgtt gcccaggacc tgaacgcgcc ttctgattgg gacagccgtg 720

ggaaggacag ttatgaaacg agtcagctgg atgaccagag tgctgaaacc cacagccaca 780

agcagtccag attatataag cggaaagcca atgatgagag caatgagcat tccgatgtga 840

ttgatagtca ggaactttcc aaagtcagcc gtgaattcca cagccatgaa tttcacagcc 900

atgaagatat gctggttgta gaccccaaaa gtaaggaaga agataaacac ctgaaatttc 960

gtatttctca tgaattagat agtgcatctt ctgaggtcaa ttaaaaggag aaaaaataca 1020

atttctcact ttgcatttag tcaaaagaaa aaatgcttta tagcaaaatg aaagagaaca 1080

tgaaatgctt ctttctcagt ttattggttg aatgtgtatc tatttgagtc tggaaataac 1140

taatgtgttt gataattagt ttagtttgtg gcttcatgga aactccctgt aaactaaaag 1200

cttcagggtt atgtctatgt tcattctata gaagaaatgc aaactatcac tgtattttaa 1260

tatttgttat tctctcatga atagaaattt atgtagaagc aaacaaaata cttttaccca 1320

cttaaaaaga gaatataaca ttttatgtca ctataatctt ttgtttttta agttagtgta 1380

tattttgttg tgattatctt tttgtggtgt gaataaatct tttatcttga atgtaataag 1440

aaaaaaaaaa aaaaaacaaa aaaaaaaaa 1469

<210> 8 
<211> 1256 
<212> DNA 
<213> HUMAN 

<400> 8 

gcagtagcag cgagcagcag agtccgcacg ctccggcgag gggcagaaga gcgcgaggga 60 

gcgcggggca gcagaagcga gagccgagcg cggacccagc caggacccac agccctcccc 120 

agctgcccag gaagagcccc agccatggaa caccagctcc tgtgctgcga agtggaaacc 180 

atccgccgcg cgtaccccga tgccaacctc ctcaacgacc gggtgctgcg ggccatgctg 240 

aaggcggagg agacctgcgc gccctcggtg tcctacttca aatgtgtgca gaaggaggtc 300 

ctgccgtcca tgcggaagat cgtcgccacc tggatgctgg aggtctgcga ggaacagaag 360 

tgcgaggagg aggtcttccc gctggccatg aactacctgg accgcttcct gtcgctggag 420 

cccgtgaaaa agagccgcct gcagctgctg ggggccactt gcatgttcgt ggcctctaag 480 

atgaaggaga ccatccccct gacggccgag aagctgtgca tctacaccga cggctccatc 540 

cggcccgagg agctgctgca aatggagctg ctcctggtga acaagctcaa gtggaacctg 600 

gccgcaatga ccccgcacga tttcattgaa cacttcctct ccaaaatgcc agaggcggag 660 

gagaacaaac agatcatccg caaacacgcg cagaccttcg ttgcctcttg tgccacagat 720 

gtgaagttca tttccaatcc gccctccatg gtggcagcgg ggagcgtggt ggccgcagtg 780 

caaggcctga acctgaggag ccccaacaac ttcctgtcct actaccgcct cacacgcttc 840 

ctctccagag tgatcaagtg tgacccagac tgcctccggg cctgccagga gcagatcgaa 900 

gccctgctgg agtcaagcct gcgccaggcc cagcagaaca tggaccccaa ggccgccgag 960 

gaggaggaag aggaggagga ggaggtggac ctggcttgca cacccaccga cgtgcgggac 1020 

gtggacatct gaggggccca ggcaggcggg cgccaccgcc acccgcagcg agggcggagc 1080 

cggccccagg tgctccacat gacagtccct cctctccgga gcattttgat accagaaggg 1140 

aaagcttcat tctccttgtt gttggttgtt ttttcctttg ctctttcccc cttccatctc 1200 

tgacttaagc aaaagaaaaa gattacccaa aaactgtctt taaaagagag agagag 1256 

<210> 9 
<211> 2121 
<212> DNA 
<213> HUMAN 

<400> 9 

ctgctcgcgg ccgccaccgc cgggccccgg ccgtccctgg ctcccctcct gcctcgagaa 60 

gggcagggct tctcagaggc ttggcgggaa aaaagaacgg agggagggat cgcgctgagt 120 

ataaaagccg gttttcgggg ctttatctaa ctcgctgtag taattccagc gagaggcaga 180 

gggagcgagc gggcggccgg ctagggtgga agagccgggc gagcagagct gcgctgcggg 240 

cgtcctggga agggagatcc ggagcgaata gggggcttcg cctctggccc agccctcccg 300

cttgatcccc caggccagcg gtccgcaacc cttgccgcat ccacgaaact ttgcccatag 360

cagcgggcgg gcactttgca ctggaactta caacacccga gcaaggacgc gactctcccg 420

acgcggggag gctattctgc ccatttgggg acacttcccc gccgctgcca ggacccgctt 480

ctctgaaagg ctctccttgc agctgcttag acgctggatt tttttcgggt agtggaaaac 540

cagcagcctc ccgcgacgat gcccctcaac gttagcttca ccaacaggaa ctatgacctc 600

gactacgact cggtgcagcc gtatttctac tgcgacgagg aggagaactt ctaccagcag 660

cagcagcaga gcgagctgca gcccccggcg cccagcgagg atatctggaa gaaattcgag 720

ctgctgccca ccccgcccct gtcccctagc cgccgctccg ggctctgctc gccctcctac 780

gttgcggtca cacccttctc ccttcgggga gacaacgacg gcggtggcgg gagcttctcc 840

acggccgacc agctggagat ggtgaccgag ctgctgggag gagacatggt gaaccagagt 900

ttcatctgcg acccggacga cgagaccttc atcaaaaaca tcatcatcca ggactgtatg 960

tggagcggct tctcggccgc cgccaagctc gtctcagaga agctggcctc ctaccaggct 1020

gcgcgcaaag acagcggcag cccgaacccc gcccgcggcc acagcgtctg ctccacctcc 1080

agcttgtacc tgcaggatct gagcgccgcc gcctcagagt gcatcgaccc ctcggtggtc 1140

ttcccctacc ctctcaacga cagcagctcg cccaagtcct gcgcctcgca agactccagc 1200

gccttctctc cgtcctcgga ttctctgctc tcctcgacgg agtcctcccc gcagggcagc 1260

cccgagcccc tggtgctcca tgaggagaca ccgcccacca ccagcagcga ctctgaggag 1320

gaacaagaag atgaggaaga aatcgatgtt gtttctgtgg aaaagaggca ggctcctggc 1380

aaaaggtcag agtctggatc accttctgct ggaggccaca gcaaacctcc tcacagccca 1440

ctggtcctca agaggtgcca cgtctccaca catcagcaca actacgcagc gcctccctcc 1500

actcggaagg actatcctgc tgccaagagg gtcaagttgg acagtgtcag agtcctgaga 1560

cagatcagca acaaccgaaa atgcaccagc cccaggtcct cggacaccga ggagaatgtc 1620

aagaggcgaa cacacaacgt cttggagcgc cagaggagga acgagctaaa acggagcttt 1680

tttgccctgc gtgaccagat cccggagttg gaaaacaatg aaaaggcccc caaggtagtt 1740

atccttaaaa aagccacagc atacatcctg tccgtccaag cagaggagca aaagctcatt 1800

tctgaagagg acttgttgcg gaaacgacga gaacagttga aacacaaact tgaacagcta 1860

cggaactctt gtgcgtaagg aaaagtaagg aaaacgattc cttctaacag aaatgtcctg 1920

agcaatcacc tatgaacttg tttcaaatgc atgatcaaat gcaacctcac aaccttggct 1980

gagtcttgag actgaaagat ttagccataa tgtaaactgc ctcaaattgg actttgggca 2040

taaaagaact tttttatgct taccatcttt tttttttctt taacagattt gtatttaaga 2100

attgttttta aaaaatttta a 2121

<210> 10 
<211> 2098 
<212> DNA 
<213> HUMAN 



<400> 10 
cctgccgaag tcagttcctt gtggagccgg agctgggcgc ggattcgccg aggcaccgag 60

gcactcagag gaggcgccat gtcagaaccg gctggggatg tccgtcagaa cccatgcggc 120

agcaaggcct gccgccgcct cttcggccca gtggacagcg agcagctgag ccgcgactgt 180

gatgcgctaa tggcgggctg catccaggag gcccgtgagc gatggaactt cgactttgtc 240

accgagacac cactggaggg tgacttcgcc tgggagcgtg tgcggggcct tggcctgccc 300

aagctctacc ttcccacggg gccccggcga ggccgggatg agttgggagg aggcaggcgg 360

cctggcacct cacctgctct gctgcagggg acagcagagg aagaccatgt ggacctgtca 420

ctgtcttgta cccttgtgcc tcgctcaggg gagcaggctg aagggtcccc aggtggacct 480

ggagactctc agggtcgaaa acggcggcag accagcatga cagatttcta ccactccaaa 540

cgccggctga tcttctccaa gaggaagccc taatccgccc acaggaagcc tgcagtcctg 600

gaagcgcgag ggcctcaaag gcccgctcta catcttctgc cttagtctca gtttgtgtgt 660

cttaattatt atttgtgttt taatttaaac acctcctcat gtacataccc tggccgcccc 720

ctgcccccca gcctctggca ttagaattat ttaaacaaaa actaggcggt tgaatgagag 780

gttcctaaga gtgctgggca tttttatttt atgaaatact atttaaagcc tcctcatccc 840

gtgttctcct tttcctctct cccggaggtt gggtgggccg gcttcatgcc agctacttcc 900

tcctccccac ttgtccgctg ggtggtaccc tctggagggg tgtggctcct tcccatcgct 960

gtcacaggcg gttatgaaat tcaccccctt tcctggacac tcagacctga attctttttc 1020

atttgagaag taaacagatg gcactttgaa ggggcctcac cgagtggggg catcatcaaa 1080

aactttggag tcccctcacc tcctctaagg ttgggcaggg tgaccctgaa gtgagcacag 1140

cctagggctg agctggggac ctggtaccct cctggctctt gatacccccc tctgtcttgt 1200

gaaggcaggg ggaaggtggg gtcctggagc agaccacccc gcctgccctc atggcccctc 1260

tgacctgcac tggggagccc gtctcagtgt tgagcctttt ccctctttgg ctcccctgta 1320

ccttttgagg agccccagct acccttcttc tccagctggg ctctgcaatt cccctctgct 1380

gctgtccctc ccccttgtcc tttcccttca gtaccctctc agctccaggt ggctctgagg 1440

tgcctgtccc acccccaccc ccagctcaat ggactggaag gggaagggac acacaagaag 1500

aagggcaccc tagttctacc tcaggcagct caagcagcga ccgccccctc ctctagctgt 1560

gggggtgagg gtcccatgtg gtggcacagg cccccttgag tggggttatc tctgtgttag 1620

gggtatatga tgggggagta gatctttcta ggagggagac actggcccct caaatcgtcc 1680

agcgaccttc ctcatccacc ccatccctcc ccagttcatt gcactttgat ragcagcgga 1740

acaaggagtc agacatttta agatggtggc agtagaggct atggacaggg ca rgccacg x 1800

gggctcatat ggggctggga gtagttgtct ttcctggcac taacgttgag cccctggagg 1860

cactgaagtg cttagtgtac ttggagtatt ggggtctgac cccaaacacc ttccagctcc 1920

tgtaacatac tggcctggac tgttttctct cggctcccca tgtgtcctgg ttcccgtttc 1980

tccacctaga ctgtaaacct ctcgagggca gggaccacac cctgtactgt tctgtgtctt 2040

tcacagctcc tcccacaatg ctgatataca gcaggtgctc aataaacgat tcttagtg 2098

<210> 11
<211> 1850
<212> DNA
<213> HUMAN

<400> 11

g g LLLdy y li yddLjLLLayy gccctgtctg ctctgtggac tcaacagttt gtggcaagac 60

dciy C LL.d.y dd t_Lya.ycici.yL. l gtcaccacag ttctggaggc tgggaagttc aagatcaaag 120

Lyi_L.ciy(wCiyci L LLdy Ly LCct tgtgaggacg tgcttcctgc ttcatagata agagcttgga 180

gc xcggcgca i_cictL,L,ctyt_clL, catctggtcg cgatggtgga cacggaaagc ccactctgcc 240

CCCICLLtLL dL LL.yctyyiw\_ ggcgatctag agagcccgtt atctgaagag rrccrgcaag 300

aaa ugggaaa CdLCLddydy atttcgcaat ccatcggcga ggatagttct ggaagctttg 360

gctttacgga ataccagtat ttaggaagct gtcctggctc agatggctcg gtcatcacgg 420

acacgctttc accagcttcg agcccctcct cggtgactta tcctgtggtc cccggcagcg 480

tggacgagtc tcccagtgga gcattgaaca tcgaatgtag aatctgcggg gacaaggccr 540

caggctatca ttacggagtc cacgcgtgtg aaggctgcaa gggcttcttt cggcgaacga 600

ttcgactcaa gctggtgtat gacaagtgcg accgcagctg caagatccag aaaaagaaca 660

gaaacaaatg ccagtattgt cgatttcaca agtgcctttc tgtcgggatg tcacacaacg 720

cgattcgttt tggacgaatg ccaagatctg agaaagcaaa actgaaagca gaaa u tc l ta 780

cctgtgaaca tgacatagaa gattctgaaa ctgcagatct caaatctctg gccaagagaa 840

tctacgaggc ctacttgaag aacttcaaca tgaacaaggt caaagcccgg gtcatcctct 900

caggaaaggc cagtaacaat ccaccttttg tcatacatga tatggagaca ctgtgtatgg 960

ctgagaagac gctggtggcc aagctggtgg ccaatggcat ccagaacaag gaggcggagg 1020

tccgcatctt tcactgctgc cagtgcacgt cagtggagac cgtcacggag ctcacggaat 1080

tcgccaaggc catcccaggc ttcgcaaact tggacctgaa cgatcaagtg acattgctaa 1140

aatacggagt ttatgaggcc atattcgcca tgctgtcttc tgtgatgaac aaagacggga 1200

tgctggtagc gtatggaaat gggtttataa ctcgtgaatt cctaaaaagc ctaaggaaac 1260

cgttctgtga tatcatggaa cccaagtttg attttgccat gaagttcaat gcactggaac 1320

tggatgacag tgatatctcc ctttttgtgg ctgctatcat ttgctgtgga gatcgtcctg 1380

gccttctaaa cgtaggacac attgaaaaaa tgcaggaggg tattgtacat gtgctcagac 1440

tccacctgca gagcaaccac ccggacgata tctttctctt cccaaaactt cttcaaaaaa 1500

tggcagacct ccggcagctg gtgacggagc atgcgcagct ggtgcagatc atcaagaaga 1560

caaaatcaaa tactqccictQ cacccgctac tgcaggagat ctacagggac atgtactgag 1620

ttccttcaga tcagccacac cttttccagg agttctgaag ctgacagcac tacaaaggag 1680

acgggggagc agcacgattt tgcacaaata tccaccactt taaccttaga gcttggacag 1740

tctgagctgt aggtaaccgg catattattc catatctttg ttttaaccag tacttctaag 1800

agcatagaac tcaaatgctg ggggaggtgg ctaatctcag gactgggaag 1850


<210> 12 
<211> 1609 
<212> DNA 
<213> HUMAN 












<400> 12 
ttcaagtctt 


tttcttttaa 


cggattgatc 


ttttgctaga 


tagagacaaa 


atatcagtgt 


bU 


gaattacagc 


aaacccctat 


tccatgctgt 


tatgggtgaa 


actctgggag 


attctcctat 


1Z0 


tgacccagaa 


agcgattcct 


tcactgatac 


actgtctgca 


aacatatcac 


aagaaatgac 


180 


catggttgac 


acagagatgc 


cattctggcc 


caccaacttt 


gggatcagct 


ccgtggatct 


240 


ctccgtaatg 


gaagaccact 


cccactcctt 


tgatatcaag 


cccttcacta 


ctgttgactt 


300 


ctccagcatt 


tctactccac 


attacgaaga 


cattccattc 


acaagaacag 


atccagtggt 


3b0 


tgcagattac 


aagtatgacc 


tgaaacttca 


agagtaccaa 


agtgcaatca 


aagtggagcc 


4ZU 


tgcatctcca 


ccttattatt 


ctgagaagac 


tcagctctac 


aataagcctc 


atgaagagcc 


a on 
4oU 


ttccaactcc 


ctcatggcaa 


ttgaatgtcg 


tgtctgtgga 


gataaagctt 


ctggatttca 


c a n 
54U 


ctatggagtt 


catgcttgtg 


aaggatgcaa 


gggtttcttc 


cggagaacaa 


tcagattgaa 


bUU 


gcttatctat 


gacagatgtg 


atcttaactg 


tcggatccac 


aaaaaaagta 


gaaataaatg 


bbO 


tcagtactgt 


cggtttcaga 


aatgccttgc 


agtggggatg 


tctcataatg 


ccatcaggtt 


/ £V 


tgggcggatg 


ccacaggccg 


agaaggagaa 


gctgttggcg 


gagatctcca 


gtgatatcga 


780 


ccagctgaat 


ccagagtccg 


ctgacctccg 


ggccctggca 


aaacatttgt 


atgactcata 


840 


cataaagtcc 


ttcccgctga 


ccaaagcaaa 


ggcgagggcg 


atcttgacag 


gaaagacaac 


900 


agacaaatca 


ccattcgtta 


tctatgacat 


gaattcctta 


atgatgggag 


aagataaaat 


960 


caagttcaaa 


cacatcaccc 


ccctgcagga 


gcagagcaaa 


gaggtggcca 


tccgcatctt 


1020 


tcagggctgc 


cagtttcgct 


ccgtggaggc 


tgtgcaggag 


atcacagagt 


atgccaaaag 


1080 



14 



WO 2006/039405 



PCT/US2005/035027 



NLEE01001WO0 . ST2 5 . txt 



cattcctggt 


tttgtaaatc 


ttgacttgaa 


cgaccaagta 


actctcctca 


aa xauggag l 


1 1 AO 


ccacgagatc 


atttacacaa 


tgctggcctc 


cttgatgaat 


aaagatgggg 


^— ^ s — — * -ft- -» +- *~ 

l tc tcatatc 




cgagggccaa 


ggcttcatga 


caagggagtt 


tctaaagagc 


ctgcgaaagc 


y- -ft- -4— -ft- y-» S-4 -ft- M — » 

ctt ttggtga 


l^DU 


ctttatggag 


cccaagtttg 


agtttgctgt 


gaagttcaat 


gcactggaat 


tagatgacag 




cgacttggca atatttattg 


ctgtcattat 


tctcagtgga 


gaccgcccag 


gtttgctgaa 


1380 


tgtgaagccc 


attgaagaca 


ttcaagacaa 


cctgctacaa 


gccctggagc 


tccagctgaa 


1440 


gctgaaccac 


cctgagtcct 


cacagctgtt 


tgccaagctg 


ctccagaaaa 


tgacagacct 


1500 


cagacagatt 


gtcacggaac 


acgtgcagct 


actgcaggtg


atcaagaaga 


cggagacaga 


1560 


catgagtctt 


cacccgctcc 


tgcaggagat 


ctacaaggac 


ttgtactag 




1609 


<210> 13 
<211> 3301 
<212> DNA 
<213> HUMAN 












<220> 

<221> misc_feature
<222> (2966)..(2973)
<223> n = a, c, g, t 












<400> 13 
gaattctgcg 


gagcctgcgg 


gacggcggcg 


ggttggcccg 


taggcagccg 


ggacagtgtit 


60


gtacagtgtt 


ttgggcatgc 


acgtgatact 


cacacagtgg 


cttctgctca 


ccaacagatg


120


aagacagatg


caccaacgag 


ggtctggaat 


ggtctggagt 


ggtctggaaa 


gcagggtcag 


180


atacccctgg 


aaaactgaag 


cccgtggagc 


aatgatctct 


acaggactgc 


ttcaaggctg


240


atgggaacca 


ccctgtagag 


gtccatctgc 


gttcagaccc 


agacgatgcc 


agagctatga


300


ctgggcctgc


aggtgtggcg 


ccgaggggag 


atcagccatg 


gagcagccac 


aggaggaagc 


360


ccctgaggtc 


cgggaagagg 


aggagaaaga 


ggaagtggca 


gaggcagaag 


gagccccaga 


420


gctcaatggg 


ggaccacagc 


atgcacttcc 


ttccagcagc 


tacacagacc 


tctcccggag 


480


ctcctcgcca 


ccctcactgc 


tggaccaact 


gcagatgggc 


tgtgacgggg 


cctcatgcgg


540


cagcctcaac


atggagtgcc 


gggtgtgcgg 


ggacaaggca 


tcgggcttcc 


ac tacgg zq l 


600


tcatgcatgt 


gaggggtgca 


agggcttctt 


ccgtcgtacg 


atccgcatga 


agctggagta 


660


cgagaagtgt 


gagcgcagct 


gcaagattca 


gaagaagaac 


cgcaacaagt 


gccagtactg 


720 


ccgcttccag 


aagtgcctgg 


cactgggcat 


gtcacacaac 


gctatccgtt 


ttggtcggat 


780 


gccggaggct 


gagaagagga 


agctggtggc 


agggctgact 


gcaaacgagg 


ggagccagta 


840 


caacccacag 


gtggccgacc 


tgaaggcctt 


ctccaagcac 


atctacaatg 


cctacctgaa 


900 


aaacttcaac 


atgaccaaaa 


agaaggcccg 


cagcatcctc 


accggcaaag 


ccagccacac 


960


ggcgcccttt


gtgatccacg 


acatcgagac 


attgtggcag 


gcagagaagg 


ggctggtgtg 


1020 


gaagcagttg 


gtgaatggcc 


tgcctcccta 


caaggagatc 


agcgtgcacg 


tcttctaccg 


1080 


ctgccagtgc 


accacagtgg 


agaccgtgcg 


ggagctcact 


gagttcgcca 


agagcatccc 


1140


cagcttcagc 


agcctcttcc 


tcaacgacca 


ggttaccctt 


ctcaagtatg 


gcgtgcacga 


1200 


ggccatcttc 


gccatgctgg 


cctctatcgt 


caacaaggac 


gggctgctgg 


tagccaacgg 


1260 


cagtggcttt 


gtcacccgtg 


agttcctgcg 


cagcctccgc 


aaacccttca 


gtgatatcat 


1320 


tgagcctaag 


tttgaatttg 


ctgtcaagtt 


caacgccctg 


gaacttgatg 


acagtgacct 


1380 


ggccctattc 


attgcggcca 


tcattctgtg 


tggagaccgg 


ccaggcctca 


tgaacgttcc 


1440


acgggtggag 


gctatccagg 


acaccatcct 


gcgtgccctc 


gaattccacc 


tgcaggccaa 


1500 


ccaccctgat 


gcccagtacc 


tcttccccaa 


gctgctgcag 


aagatggctg 


acctgcggca 


1560 


actggtcacc 


gagcacgccc 


agatgatgca 


gcggatcaag 


aagaccgaaa 


ccgagacctc 


1620 


gctgcaccct 


ctgctccagg 


agatctacaa 


ggacatgtac


taacggcggc


acccaggcct 


1680 


ccctgcagac 


tccaatgggg 


ccagcactgg 


aggggcccac 


CCaCaigdLL 


tttccattga 


1740 


ccagctctct 


tcctgtcttt 


gttgtctccc 


lc l xrc rcag 


I ICCICT.L LC 


ttttctaatt 


1800 


cctgttgctc 


tgtttcttcc 


tttctgtagg 


tttctctctt 


cccttctccc 


ttctcccttg 


1860


ccctcccttt 


ctctctccta 


tccccacgtc 


tgtcctcctt 


tcttattctg 


tgagatgttt 


1920 


tgtattattt 


caccagcagc 


atagaacagg 


acctctgctt 


ttgcacacct 


tttccccagg 


1980 


agcagaagag 


agtgggcctg 


ccctctgccc 


catcattgca 


cctgcaggct 


taggtcctca 


2040 


cttctgtctc 


ctgtcttcag 


agcaaaagac 


ttgagccatc 


caaagaaaca 


ctaagctctc 


2100 


tgggcctggg 


ttccagggaa 


ggctaagcat 


ggcctggact 


gactgcagcc 


ccctatagtc 


2160 


atggggtccc 


tgctgcaaag 


gacagtggca 


gaccccggca 


gtagagccga 


gatgcctccc 


2220 


caagactgtc 


attgcccctc 


cgatcgtgag 


gccacccact 


gacccaatga 


tcctctccag 


2280 


cagcacacct 


cagccccact 


gacacccagt 


gtccttccat 


cttcacactg 


gtttgccagg 


2340 


ccaatgttgc 


tgatggcccc 


tccagcacac 


acacataagc 


actgaaatca 


ctttacctgc


2400


aggcaccatg 


cacctccctt 


ccctccctga 


ggcaggtgag 


aacccagaga 


gaggggcctg 


2460 


caggtgagca 


ggcagggctg 


ggccaggtct 


ccggggaggc 


aggggtcctg 


caggtcctgg 


2520 


tgggtcagcc 


cagcacctcg 


cccagtggga 


gcttcccggg 


ataaactgag 


cctgttcatt 


2580 


ctgatgtcca 


tttgtcccaa 


tagctctact 


gccctcccct 


tcccctttac 


tcagcccagc 


2640 


tggccaccta 


gaagtctccc 


tgcacagcct 


ctagtgtccg 


gggaccttgt 


gggaccagtc 


2700 


ccacaccgct 


ggtccctgcc 


ctcccctgct 


cccaggttga 


ggtgcgctca 


cctcagagca 


2760 


gggccaaagc 


acagctgggc 


atgccatgtc 


tgagcggcgc 


agagccctcc 


aggcctgcag 


2820


gggcaagggg


ctggctggag 


tctcagagca 


cagaggtagg 


agaactgggg 


ttcaagccca 


2880 


ggcttcctgg 


gtcctgcctg 


gtcctccctc 


ccaaggagcc 


attctatgtg 


actctgggtg 


2940 


gaagtgccca 


gcccctgcct 


gacggnnnnn 


nngatcactc 


tctgctggca 


ggattcttcc 


3000 


cgctccccac 


ctacccagct 


gatgggggtt 


ggggtgcttc 


tttcagccaa 


ggctatgaag 


3060 


ggacagctgc 


tgggacccac 


ctcccccctt 


ccccggccac 


atgccgcgtc 


cctgccccca 


3120 


cccgggtctg 


gtgctgagga 


tacagctctt 


ctcagtgtct 


gaacaatctc 


caaaattgaa 


3180 


atgtatattt 


ttgctaggag 


ccccagcttc 


ctgtgttttt 


aatataaata 


gtgtacacag 


3240 


actgacgaaa 


ctttaaataa 


atgggaatta 


aatatttaaa 


aaaaaaagcg 


gccgcgaatt 


3300 


c 












3301 



<210> 14 

<211> 3083 

<212> DNA 

<213> HUMAN 

<400> 14 



aaaaactgca 


gccaacttcc 


gaggcagcct 


cattgcccag 


cggaccccag 


cctctgccag 


60 


gttcggtccg 


ccatcctcgt 


cccgtcctcc 


gccggcccct 


gccccgcgcc 


cagggatcct 


120 


ccagctcctt 


tcgcccgcgc 


cctccgttcg 


ctccggacac 


catggacaag 


ttttggtggc 


180 


acgcagcctg 


gggactctgc 


ctcgtgccgc 


tgagcctggc 


gcagatcgat 


ttgaatataa 


240 


cctgccgctt 


tgcaggtgta 


ttccacgtgg


agaaaaatgg 


tcgctacagc 


atctctcgga 


300 


cggaggccgc 


tgacctctgc 


aaggctttca 


atagcacctt 


gcccacaatg 


gcccagatgg 


360 


agaaagctct 


gagcatcgga 


tttgagacct 


gcaggtatgg 


gttcatagaa 


gggcacgtgg 


420 


tgattccccg 


gatccacccc 


aactccatct 


gtgcagcaaa 


caacacaggg 


gtgtacatcc 


480 


tcacatccaa 


cacctcccag 


tatgacacat 


attgcttcaa 


tgcttcagct 


ccacctgaag 


540 


aagattgtac 


atcagtcaca 


gacctgccca 


atgcctttga 


tggaccaatt 


accataacta 


600 


ttgttaaccg 


tgatggcacc 


cgctatgtcc 


agaaaggaga 


atacagaacg 


aatcctgaag 


660 


acatctaccc 


cagcaaccct 


actgatgatg 


acgtgagcag 


cggcttttct 


actgtacacc 


720 


ccatcccaga 


cgaagacagt 


ccctggatca 


cctcctccag 


tgaaaggagc 


agcacttcag 


780 


gaggttacat 


cttttacacc 


gacagcacag 


acagaatccc 


tgctaccact 


ttgatgagca 


840 


ctagtgctac 


agcaactgag 


acagcaacca 


agaggcaaga 


aacctgggat 


tggttttcat 


900 


ggttgtttct 


accatcagag 


tcaaagaatc 


atcttcacac 


aacaacacaa 


atggctggta 


960 


cgtcttcaaa 


taccatctca 


gcaggctggg 


agccaaatga 


agaaaatgaa 


gatgaaagag 


1020 


acagacacct 


cagtttttct 


ggatcaggca 


ttgatgatga 


tgaagatttt 


atctccagca 


1080 


ccatttcaac 


cacaccacgg 


gcttttgacc 


acacaaaaca 


gaaccaggac 


tggacccagt 


1140


ggaacccaag


ccattcaaat 


ccggaagtgc 


tacttcagac 


aaccacaagg 


atgactgatg 


1200 


tagacagaaa 


tggcaccact 


gcttatgaag 


gaaactggaa 


cccagaagca 


caccctcccc 


1260 


tcattcacca 


tgagcatcat 


gaggaagaag 


agaccccaca 


ttctacaagc 


acaatccagg 


1320 


caactcctag 


tagtacaacg 


gaagaaacag 


ctacccagaa 


ggaacagtgg 


tttggcaaca 


1380 


gatggcatga 


gggatatcgc 


caaacaccca 


aagaagactc 


ccatttcaac 


ccaatctcac 


1440 


accccatggg 


acgaggtcat 


caagcaggaa 


gatcgacaac 


agggacagct 


gcagcctcag 


1500 


ctcataccag 


ccatccaatg 


caaggaagga 


caacaccaag 


cccagaggac 


agttcctgga 


1560 


ctgatttcag 


gatggatatg 


gactccagtc 


atagtataac 


gcttcagcct 


actgcaaatc 


1620 


caaacacagg 


tttggtggaa 


gatttggaca 


ggacaggacc 


tctttcaatg 


acaacgcagc 


1680 


agagtaattc 


tcagagcttc 


tctacatcac 


atgaaggctt 


ggaagaagat 


aaagaccatc 


1740 


caacaacttc 


tactctgaca 


tcaagcaata 


ggaatgatgt 


cacaggtgga 


agaagagacc 


1800 


caaatcattc 


tgaaggctca 


actactttac 


tggaaggtta 


tacctctcat 


tacccacaca 


1860 


cgaaggaaag 


caggaccttc 


atcccagtga 


cctcagctaa 


gactgtcaat 


cgttccttat 


1920 


caggagacca 


agacacattc 


caccccagtg 


gggggtcctt 


tggagttact 


gcagttactg 


1980 


ttggagattc 


caactctaat 


gggtcccata 


ccactcatgg 


atctgaatca 


gatggacact 


2040 


cacatgggag 


tcaagaaggt 


ggagcaaaca 


caacctctgg 


tcctataagg 


acaccccaaa 


2100 


ttccagaatg 


gctgatcatc 


ttggcatccc 


tcttggcctt 


ggctttgatt 


cttgcagttt 


2160 


gcattgcagt 


caacagtcga 


agaaggtgtg 


ggcagaagaa 


aaagctagtg 


atcaacagtg 


2220 


gcaatggagc 


tgtggaggac 


agaaagccaa 


gtggactcaa 


cggagaggcc 


agcaagtctc 


2280 


aggaaatggt 


gcatttggtg 


aacaaggagt 


cgtcagaaac 


tccagaccag 


tttatgacag 


2340 


ctgatgagac 


aaggaacctg 


cagaatgtgg 


acatgaagat 


tggggtgtaa 


cacctacacc 


2400 


attatcttgg 


aaagaaacaa 


ccgttggaaa 


cataaccatt 


acagggagct 


gggacactta 


2460 


acagatgcaa 


tgtgctactg 


attgtttcat 


tgcgaatctt 


ttttagcata 


aaattttcta 


2520 


ctctttttgt 


tttttgtgtt 


ttgttcttta 


aagtcaggtc 


caatttgtaa 


aaacagcatt 


2580 


gctttctgaa 


attagggccc 


aattaataat 


cagcaagaat 


ttgatcgttc 


cagttcccac 


2640 


ttggaggcct 


ttcatccctc 


gggtgtgcta 


tggatggctt 


ctaacaaaaa 


ctacacatat 


2700 


gtattcctga 


tcgccaacct 


ttcccccacc 


agctaaggac 


atttcccagg 


gttaataggg 


2760 


cctaatccct 


gggaggaaat


ttgaatgggt


ccattttgcc 


cttccatagc 


ctaatccctg 


2820 


ggcattgctt 


tccactgagg 


ttgggggttg 


gggtgtacta 


gttacacatc 


ttcaacagac 


2880 


cccctctaga 


aatttttcag 


atgcttctgg 


gagacaccca 


aagggtgaag 


ctatttatct 


2940 


gtagtaaact 


atttatctgt 


gtttttgaaa 


tattaaaccc 


tggatcagtc 


ctttgatcag 


3000 


tataattttt 


taaagttact 


ttgtcagagg 


cacaaaaggg 


tttaaactga 


ttcataataa 


3060


<210> 15

<211> 2539 

<212> DNA 

<213> HUMAN 

<400> 15 



ggagtctctt 


gctctggttc 


ttgctgttcc 


tgctcctgct 


cccgccgctc 


cccgtcctgc 


60 


tcgcggaccc 


aggggcgccc 


acgccagtga 


atccctgttg 


ttactatcca 


tgccagcacc 


120 


agggcatctg 


tgtccgcttc 


ggccttgacc 


gctaccagtg 


tgactgcacc 


cgcacgggct 


180 


attccggccc 


caactgcacc 


atccctggcc 


tgtggacctg 


gctccggaat 


tcactgcggc 


240 


ccagcccctc 


tttcacccac 


ttcctgctca 


ctcacgggcg 


ctggttctgg 


gagtttgtca 


300 


atgccacctt 


catccgagag 


atgctcatgc 


gcctggtact 


cacagtgcgc 


tccaacctta 


360 


tccccagtcc 


ccccacctac 


aactcagcac 


atgactacat 


cagctgggag 


tctttctcca 


420 


acgtgagcta 


ttacactcgt 


attctgccct 


ctgtgcctaa 


agattgcccc 


acacccatgg 


480 


gaaccaaagg 


gaagaagcag 


ttgccagatg 


cccagctcct 


ggcccgccgc 


ttcctgctca 


540 


ggaggaagtt 


catacctgac 


ccccaaggca 


ccaacctcat 


gtttgccttc 


tttgcacaac 


600 


acttcaccca 


ccagttcttc 


aaaacttctg 


gcaagatggg 


tcctggcttc 


accaaggcct 


660 


tgggccatgg 


ggtagacctc 


ggccacattt 


atggagacaa 


tctggagcgt 


cagtatcaac 


720 


tgcggctctt 


taaggatggg 


aaactcaagt 


accaggtgct 


ggatggagaa 


atgtacccgc 


780 


cctcggtaga 


agaggcgcct 


gtgttgatgc 


actacccccg 


aggcatcccg 


ccccagagcc 


840 


agatggctgt 


gggccaggag 


gtgtttgggc 


tgcttcctgg 


gctcatgctg 


tatgccacgc 


900 


tctggctacg 


tgagcacaac 


cgtgtgtgtg 


acctgctgaa 


ggctgagcac 


cccacctggg 


960 


gcgatgagca 


gcttttccag 


acgacccgcc 


tcatcctcat 


aggggagacc 


atcaagattg 


1020 


tcatcgagga 


gtacgtgcag 


cagctgagtg 


gctatttcct 


gcagctgaaa 


tttgacccag 


1080 


agctgctgtt 


cggtgtccag 


ttccaatacc 


gcaaccgcat 


tgccatggag 


ttcaaccatc 


1140 


tctaccactg 


gcaccccctc 


atgcctgact 


ccttcaaggt 


gggctcccag 


gagtacagct 


1200 


acgagcagtt 


cttgttcaac 


acctccatgt 


tggtggacta 


tggggttgag 


gccctggtgg 


1260 


atgccttctc 


tcgccagatt 


gctggccgga 


tcggtggggg 


caggaacatg 


gaccaccaca 


1320 


tcctgcatgt 


ggctgtggat 


gtcatcaggg 


agtctcggga 


gatgcggctg 


cagcccttca 


1380 


atgagtaccg 


caagaggttt 


ggcatgaaac 


cctacacctc 


cttccaggag 


ctcgtaggag 


1440 


agaaggagat 


ggcagcagag 


ttggaggaat 


tgtatggaga 


cattgatgcg 


ttggagttct 


1500 


accctggact 


gcttcttgaa 


aagtgccatc 


caaactctat 


ctttggggag 


agtatgatag 


1560 


agattggggc 


tcccttttcc 


ctcaagggtc 


tcctagggaa 


tcccatctgt 


tctccggagt 


1620


actggaagcc


gagcacattt 


ggcggcgagg 


tgggctttaa 


cattgtcaag 


acggccacac 


1680


tgaagaagct 


ggtctgcctc 


aacaccaaga 


cctgtcccta 


cgtttccttc 


cgtgtgccgg


1740


atgccagtca


ggatgatggg 


cctgctgtgg 


agcgaccatc 


cacagagctc 


tgaggggcag 


1800


gaaagcagca 


ttctggaggg 


gagagctttg 


tgcttgtcat 


tccagagtgc 


tgaggccagg 


1860


gctgatggtc 


ttaaatgctc 


attttctggt 


ttggcatggt 


gagtgttggg 


gttgacattt 


1920 


agaactttaa 


gtctcaccca 


ttatctggaa 


tattgtgatt 


ctgtttattc 


ttccagaatg 


1980 


ctgaactcct 


tgttagccct 


tcagattgtt 


aggagtggtt 


ctcatttggt 


ctgccagaat 


2040


actgggttct 


tagttgacaa 


cctagaatgt 


cagatttctg 


gttgatttgt 


aacacagtca 


2100 


ttctaggatg 


tggagctact 


gatgaaatct 


gctagaaagt 


tagggggttc 


ttattttgca 


2160


ttccagaatc 


ttgactttct 


gattggtgat 


tcaaagtgtt 


gtgttcctgg 


ctgatgatcc 


2220 


agaacagtgg 


ctcgtatccc 


aaatctgtca 


gcatctggct 


gtctagaatg 


tggatttgat


2280


tcattttcct 


gttcagtgag 


atatcataga 


gacggagatc 


ctaaggtcca 


acaagaatgc 


2340 


attccctgaa 


tctgtgcctg 


cactgagagg 


gcaaggaagt 


ggggtgttct 


tcttgggacc 


2400 


cccactaaga 


ccctggtctg 


aggatgtaga 


gagaacaggt 


gggctgtatt 


cacgccattg 


2460 


gttggaagct 


accagagctc 


tatccccatc 


caggtcttga 


ctcatggcag 


ctgtttctca 


2520 


tgaagctaat 


aaaattcgc 










2539 


<210> 16 
<211> 369 
<212> DNA 
<213> HUMAN 












<400> 16 
atgaagcttc 


tcacgggcct 


ggttttctgc 


tccttggtcc 


tgggtgtcag 


cagccgaagc 


60


ttcttttcgt 


tccttggcga 


ggcttttgat 


ggggctcggg 


acatgtggag 


agcctactct 


120 


gacatgagag 


aagccaatta 


catcggctca 


gacaaatact 


tccatgctcg 


ggggaactat 


180 


gatgctgcca 


aaaggggacc 


tgggggtgtc 


tgggctgcag 


aagcgatcag 


cgatgccaga 


240 


gagaatatcc 


agagattctt 


tggccatggt 


gcggaggact 


cgctggctga 


tcaggctgcc 


300 


aatgaatggg 


gcaggagtgg 


caaagacccc 


aatcacttcc 


gacctgctgg 


cctgcctgag 


360 



aaatactga 



<210> 17 

<211> 67 

<212> PRT 

<213> HUMAN 

<400> 17 

Met Thr Ser Lys Leu Ala Val Ala Leu Leu Ala Ala Phe Leu Ile Ser
1 5 10 15

Ala Ala Leu Cys Glu Gly Ala Val Leu Pro Arg Ser Ala Lys Glu Leu
20 25 30

Arg Cys Gln Cys Ile Lys Thr Tyr Ser Lys Pro Phe His Pro Lys Phe
35 40 45

Ile Lys Glu Leu Arg Val Ile Glu Ser Gly Pro His Cys Ala Asn Thr
50 55 60

Glu Ile Met
65 

<210> 18 
<211> 604 
<212> PRT 
<213> HUMAN 

<400> 18 

Met Leu Ala Arg Ala Leu Leu Leu Cys Ala Val Leu Ala Leu Ser His 
1 5 10 15

Thr Ala Asn Pro Cys Cys Ser His Pro Cys Gln Asn Arg Gly Val Cys
20 25 30

Met Ser Val Gly Phe Asp Gln Tyr Lys Cys Asp Cys Thr Arg Thr Gly
35 40 45

Phe Tyr Gly Glu Asn Cys Ser Thr Pro Glu Phe Leu Thr Arg Ile Lys
50 55 60

Leu Phe Leu Lys Pro Thr Pro Asn Thr Val His Tyr Ile Leu Thr His
65 70 75 80

Phe Lys Gly Phe Trp Asn Val Val Asn Asn Ile Pro Phe Leu Arg Asn
85 90 95

Ala Ile Met Ser Tyr Val Leu Thr Ser Arg Ser His Leu Ile Asp Ser
100 105 110

Pro Pro Thr Tyr Asn Ala Asp Tyr Gly Tyr Lys Ser Trp Glu Ala Phe
115 120 125

Ser Asn Leu Ser Tyr Tyr Thr Arg Ala Leu Pro Pro Val Pro Asp Asp
130 135 140

Cys Pro Thr Pro Leu Gly Val Lys Gly Lys Lys Gln Leu Pro Asp Ser
145 150 155 160



Asn Glu Ile Val Glu Lys Leu Leu Leu Arg Arg Lys Phe Ile Pro Asp
165 170 175

Pro Gln Gly Ser Asn Met Met Phe Ala Phe Phe Ala Gln His Phe Thr
180 185 190

His Gln Phe Phe Lys Thr Asp His Lys Arg Gly Pro Ala Phe Thr Asn
195 200 205

Gly Leu Gly His Gly Val Asp Leu Asn His Ile Tyr Gly Glu Thr Leu
210 215 220

Ala Arg Gln Arg Lys Leu Arg Leu Phe Lys Asp Gly Lys Met Lys Tyr
225 230 235 240

Gln Ile Ile Asp Gly Glu Met Tyr Pro Pro Thr Val Lys Asp Thr Gln
245 250 255

Ala Glu Met Ile Tyr Pro Pro Gln Val Pro Glu His Leu Arg Phe Ala
260 265 270

Val Gly Gln Glu Val Phe Gly Leu Val Pro Gly Leu Met Met Tyr Ala
275 280 285

Thr Ile Trp Leu Arg Glu His Asn Arg Val Cys Asp Val Leu Lys Gln
290 295 300

Glu His Pro Glu Trp Gly Asp Glu Gln Leu Phe Gln Thr Ser Arg Leu
305 310 315 320

Ile Leu Ile Gly Glu Thr Ile Lys Ile Val Ile Glu Asp Tyr Val Gln
325 330 335

His Leu Ser Gly Tyr His Phe Lys Leu Lys Phe Asp Pro Glu Leu Leu
340 345 350

Phe Asn Lys Gln Phe Gln Tyr Gln Asn Arg Ile Ala Ala Glu Phe Asn
355 360 365

Thr Leu Tyr His Trp His Pro Leu Leu Pro Asp Thr Phe Gln Ile His
370 375 380

Asp Gln Lys Tyr Asn Tyr Gln Gln Phe Ile Tyr Asn Asn Ser Ile Leu
385 390 395 400

Leu Glu His Gly Ile Thr Gln Phe Val Glu Ser Phe Thr Arg Gln Ile
405 410 415

Ala Gly Arg Val Ala Gly Gly Arg Asn Val Pro Pro Ala Val Gln Lys
420 425 430

Val Ser Gln Ala Ser Ile Asp Gln Ser Arg Gln Met Lys Tyr Gln Ser
435 440 445

Phe Asn Glu Tyr Arg Lys Arg Phe Met Leu Lys Pro Tyr Glu Ser Phe
450 455 460

Glu Glu Leu Thr Gly Glu Lys Glu Met Ser Ala Glu Leu Glu Ala Leu
465 470 475 480

Tyr Gly Asp Ile Asp Ala Val Glu Leu Tyr Pro Ala Leu Leu Val Glu
485 490 495

Lys Pro Arg Pro Asp Ala Ile Phe Gly Glu Thr Met Val Glu Val Gly
500 505 510

Ala Pro Phe Ser Leu Lys Gly Leu Met Gly Asn Val Ile Cys Ser Pro
515 520 525

Ala Tyr Trp Lys Pro Ser Thr Phe Gly Gly Glu Val Gly Phe Gln Ile
530 535 540

Ile Asn Thr Ala Ser Ile Gln Ser Leu Ile Cys Asn Asn Val Lys Gly
545 550 555 560

Cys Pro Phe Thr Ser Phe Ser Val Pro Asp Pro Glu Leu Ile Lys Thr
565 570 575

Val Thr Ile Asn Ala Ser Ser Ser Arg Ser Gly Leu Asp Asp Ile Asn
580 585 590

Pro Thr Val Leu Leu Lys Glu Arg Ser Thr Glu Leu
595 600 

<210> 19 

<211> 360 

<212> PRT 

<213> HUMAN 

<400> 19 

Met Glu Asp Phe Asn Met Glu Ser Asp Ser Phe Glu Asp Phe Trp Lys
1 5 10 15

Gly Glu Asp Leu Ser Asn Tyr Ser Tyr Ser Ser Thr Leu Pro Pro Phe
20 25 30

Leu Leu Asp Ala Ala Pro Cys Glu Pro Glu Ser Leu Glu Ile Asn Lys
35 40 45

Tyr Phe Val Val Ile Ile Tyr Ala Leu Val Phe Leu Leu Ser Leu Leu
50 55 60

Gly Asn Ser Leu Val Met Leu Val Ile Leu Tyr Ser Arg Val Gly Arg
65 70 75 80

Ser Val Thr Asp Val Tyr Leu Leu Asn Leu Ala Leu Ala Asp Leu Leu
85 90 95

Phe Ala Leu Thr Leu Pro Ile Trp Ala Ala Ser Lys Val Asn Gly Trp
100 105 110

Ile Phe Gly Thr Phe Leu Cys Lys Val Val Ser Leu Leu Lys Glu Val
115 120 125

Asn Phe Tyr Ser Gly Ile Leu Leu Leu Ala Cys Ile Ser Val Asp Arg
130 135 140

Tyr Leu Ala Ile Val His Ala Thr Arg Thr Leu Thr Gln Lys Arg Tyr
145 150 155 160

Leu Val Lys Phe Ile Cys Leu Ser Ile Trp Gly Leu Ser Leu Leu Leu
165 170 175

Ala Leu Pro Val Leu Leu Phe Arg Arg Thr Val Tyr Ser Ser Asn Val
180 185 190

Ser Pro Ala Cys Tyr Glu Asp Met Gly Asn Asn Thr Ala Asn Trp Arg
195 200 205

Met Leu Leu Arg Ile Leu Pro Gln Ser Phe Gly Phe Ile Val Pro Leu
210 215 220

Leu Ile Met Leu Phe Cys Tyr Gly Phe Thr Leu Arg Thr Leu Phe Lys
225 230 235 240

Ala His Met Gly Gln Lys His Arg Ala Met Arg Val Ile Phe Ala Val
245 250 255

Val Leu Ile Phe Leu Leu Cys Trp Leu Pro Tyr Asn Leu Val Leu Leu
260 265 270

Ala Asp Thr Leu Met Arg Thr Gln Val Ile Gln Glu Thr Cys Glu Arg
275 280 285

Arg Asn His Ile Asp Arg Ala Leu Asp Ala Thr Glu Ile Leu Gly Ile
290 295 300

Leu His Ser Cys Leu Asn Pro Leu Ile Tyr Ala Phe Ile Gly Gln Lys
305 310 315 320

Phe Arg His Gly Leu Leu Lys Ile Leu Ala Ile His Gly Leu Ile Ser
325 330 335

Lys Asp Ser Leu Pro Lys Asp Ser Arg Pro Ser Phe Val Gly Ser Ser
340 345 350

Ser Gly His Thr Ser Thr Thr Leu
355 360 

<210> 20 

<211> 554 

<212> PRT 

<213> HUMAN 

<400> 20 

Met Thr Ala Pro Gly Ala Ala Gly Arg Cys Pro Pro Thr Thr Trp Leu 
1 5 10 15 

Gly Ser Leu Leu Leu Leu Val Cys Leu Leu Ala Ser Arg Ser Ile Thr
20 25 30

Glu Glu Val Ser Glu Tyr Cys Ser His Met Ile Gly Ser Gly His Leu
35 40 45

Gln Ser Leu Gln Arg Leu Ile Asp Ser Gln Met Glu Thr Ser Cys Gln
50 55 60

Ile Thr Phe Glu Phe Val Asp Gln Glu Gln Leu Lys Asp Pro Val Cys
65 70 75 80

Tyr Leu Lys Lys Ala Phe Leu Leu Val Gln Asp Ile Met Glu Asp Thr
85 90 95

Met Arg Phe Arg Asp Asn Thr Ala Asn Pro Ile Ala Ile Val Gln Leu
100 105 110

Gln Glu Leu Ser Leu Arg Leu Lys Ser Cys Phe Thr Lys Asp Tyr Glu
115 120 125

Glu His Asp Lys Ala Cys Val Arg Thr Phe Tyr Glu Thr Pro Leu Gln
130 135 140

Leu Leu Glu Lys Val Lys Asn Val Phe Asn Glu Thr Lys Asn Leu Leu
145 150 155 160

Asp Lys Asp Trp Asn Ile Phe Ser Lys Asn Cys Asn Asn Ser Phe Ala
165 170 175

Glu Cys Ser Ser Gln Asp Val Val Thr Lys Pro Asp Cys Asn Cys Leu
180 185 190

Tyr Pro Lys Ala Ile Pro Ser Ser Asp Pro Ala Ser Val Ser Pro His
195 200 205

Gln Pro Leu Ala Pro Ser Met Ala Pro Val Ala Gly Leu Thr Trp Glu
210 215 220

Asp Ser Glu Gly Thr Glu Gly Ser Ser Leu Leu Pro Gly Glu Gln Pro
225 230 235 240

Leu His Thr Val Asp Pro Gly Ser Ala Lys Gln Arg Pro Pro Arg Ser
245 250 255

Thr Cys Gln Ser Phe Glu Pro Pro Glu Thr Pro Val Val Lys Asp Ser
260 265 270

Thr Ile Gly Gly Ser Pro Gln Pro Arg Pro Ser Val Gly Ala Phe Asn
275 280 285

Pro Gly Met Glu Asp Ile Leu Asp Ser Ala Met Gly Thr Asn Trp Val
290 295 300

Pro Glu Glu Ala Ser Gly Glu Ala Ser Glu Ile Pro Val Pro Gln Gly
305 310 315 320

Thr Glu Leu Ser Pro Ser Arg Pro Gly Gly Gly Ser Met Gln Thr Glu
325 330 335

Pro Ala Arg Pro Ser Asn Phe Leu Ser Ala Ser Ser Pro Leu Pro Ala
340 345 350

Ser Ala Lys Gly Gln Gln Pro Ala Asp Val Thr Ala Thr Ala Leu Pro
355 360 365

Arg Val Gly Pro Val Met Pro Thr Gly Gln Asp Trp Asn His Thr Pro
370 375 380

Gln Lys Thr Asp His Pro Ser Ala Leu Leu Arg Asp Pro Pro Glu Pro
385 390 395 400

Gly Ser Pro Arg Ile Ser Ser Leu Arg Pro Gln Ala Leu Ser Asn Pro
405 410 415

Ser Thr Leu Ser Ala Gln Pro Gln Leu Ser Arg Ser His Ser Ser Gly
420 425 430

Ser Val Leu Pro Leu Gly Glu Leu Glu Gly Arg Arg Ser Thr Arg Asp
435 440 445

Arg Thr Ser Pro Ala Glu Pro Glu Ala Ala Pro Ala Ser Glu Gly Ala
450 455 460

Ala Arg Pro Leu Pro Arg Phe Asn Ser Val Pro Leu Thr Asp Thr Gly
465 470 475 480

His Glu Arg Gln Ser Glu Gly Ser Ser Ser Pro Gln Leu Gln Glu Ser
485 490 495

Val Phe His Leu Leu Val Pro Ser Val Ile Leu Val Leu Leu Ala Val
500 505 510

Gly Gly Leu Leu Phe Tyr Arg Trp Arg Arg Arg Ser His Gln Glu Pro
515 520 525

Gln Arg Ala Asp Ser Pro Leu Glu Gln Pro Glu Gly Ser Pro Leu Thr
530 535 540

Gln Asp Asp Arg Gln Val Glu Leu Pro Val
545 550 

<210> 21 

<211> 107 

<212> PRT 

<213> HUMAN 

<400> 21 

Met Ala Arg Ala Ala Leu Ser Ala Ala Pro Ser Asn Pro Arg Leu Leu 
1 5 10 15

Arg Val Ala Leu Leu Leu Leu Leu Leu Val Ala Ala Gly Arg Arg Ala
20 25 30

Ala Gly Ala Ser Val Ala Thr Glu Leu Arg Cys Gln Cys Leu Gln Thr
35 40 45

Leu Gln Gly Ile His Pro Lys Asn Ile Gln Ser Val Asn Val Lys Ser
50 55 60

Pro Gly Pro His Cys Ala Gln Thr Glu Val Ile Ala Thr Leu Lys Asn
65 70 75 80

Gly Arg Lys Ala Cys Leu Asn Pro Ala Ser Pro Ile Val Lys Lys Ile
85 90 95

Ile Glu Lys Met Leu Asn Ser Asp Lys Ser Asn
100 105 

<210> 22 
<211> 106 
<212> PRT 
<213> HUMAN 

<400> 22 

Met Ala His Ala Thr Leu Ser Ala Ala Pro Ser Asn Pro Arg Leu Leu
1 5 10 15

Arg Val Ala Leu Leu Leu Leu Leu Leu Val Gly Ser Arg Arg Ala Ala
20 25 30

Gly Ala Ser Val Val Thr Glu Leu Arg Cys Gln Cys Leu Gln Thr Leu
35 40 45

Gln Gly Ile His Leu Lys Asn Ile Gln Ser Val Asn Val Arg Ser Pro
50 55 60

Gly Pro His Cys Ala Gln Thr Glu Val Ile Ala Thr Leu Lys Asn Gly
65 70 75 80

Lys Lys Ala Cys Leu Asn Pro Ala Ser Pro Met Val Gln Lys Ile Ile
85 90 95

Glu Lys Ile Leu Asn Lys Gly Ser Thr Asn
100 105 

<210> 23 
<211> 300 
<212> PRT 
<213> HUMAN 

<400> 23 

Met Arg Ile Ala Val Ile Cys Phe Cys Leu Leu Gly Ile Thr Cys Ala
1 5 10 15

Ile Pro Val Lys Gln Ala Asp Ser Gly Ser Ser Glu Glu Lys Gln Leu
20 25 30

Tyr Asn Lys Tyr Pro Asp Ala Val Ala Thr Trp Leu Asn Pro Asp Pro
35 40 45

Ser Gln Lys Gln Asn Leu Leu Ala Pro Gln Thr Leu Pro Ser Lys Ser
50 55 60

Asn Glu Ser His Asp His Met Asp Asp Met Asp Asp Glu Asp Asp Asp
65 70 75 80

Asp His Val Asp Ser Gln Asp Ser Ile Asp Ser Asn Asp Ser Asp Asp
85 90 95

Val Asp Asp Thr Asp Asp Ser His Gln Ser Asp Glu Ser His His Ser
100 105 110

Asp Glu Ser Asp Glu Leu Val Thr Asp Phe Pro Thr Asp Leu Pro Ala
115 120 125

Thr Glu Val Phe Thr Pro Val Val Pro Thr Val Asp Thr Tyr Asp Gly
130 135 140

Arg Gly Asp Ser Val Val Tyr Gly Leu Arg Ser Lys Ser Lys Lys Phe
145 150 155 160

Arg Arg Pro Asp Ile Gln Tyr Pro Asp Ala Thr Asp Glu Asp Ile Thr
165 170 175

Ser His Met Glu Ser Glu Glu Leu Asn Gly Ala Tyr Lys Ala Ile Pro
180 185 190

Val Ala Gln Asp Leu Asn Ala Pro Ser Asp Trp Asp Ser Arg Gly Lys
195 200 205

Asp Ser Tyr Glu Thr Ser Gln Leu Asp Asp Gln Ser Ala Glu Thr His
210 215 220

Ser His Lys Gln Ser Arg Leu Tyr Lys Arg Lys Ala Asn Asp Glu Ser
225 230 235 240

Asn Glu His Ser Asp Val Ile Asp Ser Gln Glu Leu Ser Lys Val Ser
245 250 255

Arg Glu Phe His Ser His Glu Phe His Ser His Glu Asp Met Leu Val
260 265 270

Val Asp Pro Lys Ser Lys Glu Glu Asp Lys His Leu Lys Phe Arg Ile
275 280 285

Ser His Glu Leu Asp Ser Ala Ser Ser Glu Val Asn
290 295 300 

<210> 24 

<211> 295 

<212> PRT 

<213> HUMAN 

<400> 24 

Met Glu His Gln Leu Leu Cys Cys Glu Val Glu Thr Ile Arg Arg Ala
1 5 10 15

Tyr Pro Asp Ala Asn Leu Leu Asn Asp Arg Val Leu Arg Ala Met Leu
20 25 30

Lys Ala Glu Glu Thr Cys Ala Pro Ser Val Ser Tyr Phe Lys Cys Val
35 40 45

Gln Lys Glu Val Leu Pro Ser Met Arg Lys Ile Val Ala Thr Trp Met
50 55 60

Leu Glu Val Cys Glu Glu Gln Lys Cys Glu Glu Glu Val Phe Pro Leu
65 70 75 80

Ala Met Asn Tyr Leu Asp Arg Phe Leu Ser Leu Glu Pro Val Lys Lys
85 90 95

Ser Arg Leu Gln Leu Leu Gly Ala Thr Cys Met Phe Val Ala Ser Lys
100 105 110

Met Lys Glu Thr Ile Pro Leu Thr Ala Glu Lys Leu Cys Ile Tyr Thr
115 120 125

Asp Gly Ser Ile Arg Pro Glu Glu Leu Leu Gln Met Glu Leu Leu Leu
130 135 140

Val Asn Lys Leu Lys Trp Asn Leu Ala Ala Met Thr Pro His Asp Phe
145 150 155 160

Ile Glu His Phe Leu Ser Lys Met Pro Glu Ala Glu Glu Asn Lys Gln
165 170 175

Ile Ile Arg Lys His Ala Gln Thr Phe Val Ala Ser Cys Ala Thr Asp
180 185 190

Val Lys Phe Ile Ser Asn Pro Pro Ser Met Val Ala Ala Gly Ser Val
195 200 205

Val Ala Ala Val Gln Gly Leu Asn Leu Arg Ser Pro Asn Asn Phe Leu
210 215 220

Ser Tyr Tyr Arg Leu Thr Arg Phe Leu Ser Arg Val Ile Lys Cys Asp
225 230 235 240

Pro Asp Cys Leu Arg Ala Cys Gln Glu Gln Ile Glu Ala Leu Leu Glu
245 250 255

Ser Ser Leu Arg Gln Ala Gln Gln Asn Met Asp Pro Lys Ala Ala Glu
260 265 270

Glu Glu Glu Glu Glu Glu Glu Glu Val Asp Leu Ala Cys Thr Pro Thr
275 280 285

Asp Val Arg Asp Val Asp Ile
290 295 

<210> 25 

<211> 439 

<212> PRT 

<213> HUMAN 

<400> 25 

Met Pro Leu Asn Val Ser Phe Thr Asn Arg Asn Tyr Asp Leu Asp Tyr
1 5 10 15

Asp Ser Val Gln Pro Tyr Phe Tyr Cys Asp Glu Glu Glu Asn Phe Tyr
20 25 30

Gln Gln Gln Gln Gln Ser Glu Leu Gln Pro Pro Ala Pro Ser Glu Asp
35 40 45

Ile Trp Lys Lys Phe Glu Leu Leu Pro Thr Pro Pro Leu Ser Pro Ser
50 55 60

Arg Arg Ser Gly Leu Cys Ser Pro Ser Tyr Val Ala Val Thr Pro Phe
65 70 75 80

Ser Leu Arg Gly Asp Asn Asp Gly Gly Gly Gly Ser Phe Ser Thr Ala
85 90 95

Asp Gln Leu Glu Met Val Thr Glu Leu Leu Gly Gly Asp Met Val Asn
100 105 110

Gln Ser Phe Ile Cys Asp Pro Asp Asp Glu Thr Phe Ile Lys Asn Ile
115 120 125

Ile Ile Gln Asp Cys Met Trp Ser Gly Phe Ser Ala Ala Ala Lys Leu
130 135 140

Val Ser Glu Lys Leu Ala Ser Tyr Gln Ala Ala Arg Lys Asp Ser Gly
145 150 155 160

Ser Pro Asn Pro Ala Arg Gly His Ser Val Cys Ser Thr Ser Ser Leu
165 170 175

Tyr Leu Gln Asp Leu Ser Ala Ala Ala Ser Glu Cys Ile Asp Pro Ser
180 185 190

Val Val Phe Pro Tyr Pro Leu Asn Asp Ser Ser Ser Pro Lys Ser Cys
195 200 205

Ala Ser Gln Asp Ser Ser Ala Phe Ser Pro Ser Ser Asp Ser Leu Leu
210 215 220

Ser Ser Thr Glu Ser Ser Pro Gln Gly Ser Pro Glu Pro Leu Val Leu
225 230 235 240

His Glu Glu Thr Pro Pro Thr Thr Ser Ser Asp Ser Glu Glu Glu Gln
245 250 255

Glu Asp Glu Glu Glu Ile Asp Val Val Ser Val Glu Lys Arg Gln Ala
260 265 270

Pro Gly Lys Arg Ser Glu Ser Gly Ser Pro Ser Ala Gly Gly His Ser
275 280 285

Lys Pro Pro His Ser Pro Leu Val Leu Lys Arg Cys His Val Ser Thr
290 295 300

His Gln His Asn Tyr Ala Ala Pro Pro Ser Thr Arg Lys Asp Tyr Pro
305 310 315 320

Ala Ala Lys Arg Val Lys Leu Asp Ser Val Arg Val Leu Arg Gln Ile
325 330 335

Ser Asn Asn Arg Lys Cys Thr Ser Pro Arg Ser Ser Asp Thr Glu Glu
340 345 350

Asn Val Lys Arg Arg Thr His Asn Val Leu Glu Arg Gln Arg Arg Asn
355 360 365

Glu Leu Lys Arg Ser Phe Phe Ala Leu Arg Asp Gln Ile Pro Glu Leu
370 375 380

Glu Asn Asn Glu Lys Ala Pro Lys Val Val Ile Leu Lys Lys Ala Thr
385 390 395 400

Ala Tyr Ile Leu Ser Val Gln Ala Glu Glu Gln Lys Leu Ile Ser Glu
405 410 415

Glu Asp Leu Leu Arg Lys Arg Arg Glu Gln Leu Lys His Lys Leu Glu
420 425 430

Gln Leu Arg Asn Ser Cys Ala
435 

<210> 26 

<211> 164 

<212> PRT 

<213> HUMAN 

<400> 26 

Met Ser Glu Pro Ala Gly Asp Val Arg Gln Asn Pro Cys Gly Ser Lys
1 5 10 15

Ala Cys Arg Arg Leu Phe Gly Pro Val Asp Ser Glu Gln Leu Ser Arg
20 25 30

Asp Cys Asp Ala Leu Met Ala Gly Cys Ile Gln Glu Ala Arg Glu Arg
35 40 45

Trp Asn Phe Asp Phe Val Thr Glu Thr Pro Leu Glu Gly Asp Phe Ala
50 55 60

Trp Glu Arg Val Arg Gly Leu Gly Leu Pro Lys Leu Tyr Leu Pro Thr
65 70 75 80

Gly Pro Arg Arg Gly Arg Asp Glu Leu Gly Gly Gly Arg Arg Pro Gly
85 90 95

Thr Ser Pro Ala Leu Leu Gln Gly Thr Ala Glu Glu Asp His Val Asp
100 105 110

Leu Ser Leu Ser Cys Thr Leu Val Pro Arg Ser Gly Glu Gln Ala Glu
115 120 125

Gly Ser Pro Gly Gly Pro Gly Asp Ser Gln Gly Arg Lys Arg Arg Gln
130 135 140

Thr Ser Met Thr Asp Phe Tyr His Ser Lys Arg Arg Leu Ile Phe Ser
145 150 155 160 

Lys Arg Lys Pro 



<210> 27 

<211> 468 

<212> PRT 

<213> HUMAN 

<400> 27 

Met Val Asp Thr Glu Ser Pro Leu Cys Pro Leu Ser Pro Leu Glu Ala
1 5 10 15

Gly Asp Leu Glu Ser Pro Leu Ser Glu Glu Phe Leu Gln Glu Met Gly
20 25 30

Asn Ile Gln Glu Ile Ser Gln Ser Ile Gly Glu Asp Ser Ser Gly Ser
35 40 45

Phe Gly Phe Thr Glu Tyr Gln Tyr Leu Gly Ser Cys Pro Gly Ser Asp
50 55 60

Gly Ser Val Ile Thr Asp Thr Leu Ser Pro Ala Ser Ser Pro Ser Ser
65 70 75 80

Val Thr Tyr Pro Val Val Pro Gly Ser Val Asp Glu Ser Pro Ser Gly
85 90 95

Ala Leu Asn Ile Glu Cys Arg Ile Cys Gly Asp Lys Ala Ser Gly Tyr
100 105 110

His Tyr Gly Val His Ala Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg
115 120 125

Thr Ile Arg Leu Lys Leu Val Tyr Asp Lys Cys Asp Arg Ser Cys Lys
130 135 140

Ile Gln Lys Lys Asn Arg Asn Lys Cys Gln Tyr Cys Arg Phe His Lys
145 150 155 160

Cys Leu Ser Val Gly Met Ser His Asn Ala Ile Arg Phe Gly Arg Met
165 170 175

Pro Arg Ser Glu Lys Ala Lys Leu Lys Ala Glu Ile Leu Thr Cys Glu

180 185 190 

His Asp Ile Glu Asp Ser Glu Thr Ala Asp Leu Lys Ser Leu Ala Lys
195 200 205

Arg Ile Tyr Glu Ala Tyr Leu Lys Asn Phe Asn Met Asn Lys Val Lys
210 215 220

Ala Arg Val Ile Leu Ser Gly Lys Ala Ser Asn Asn Pro Pro Phe Val
225 230 235 240

Ile His Asp Met Glu Thr Leu Cys Met Ala Glu Lys Thr Leu Val Ala
245 250 255

Lys Leu Val Ala Asn Gly Ile Gln Asn Lys Glu Ala Glu Val Arg Ile
260 265 270

Phe His Cys Cys Gln Cys Thr Ser Val Glu Thr Val Thr Glu Leu Thr
275 280 285

Glu Phe Ala Lys Ala Ile Pro Gly Phe Ala Asn Leu Asp Leu Asn Asp
290 295 300

Gln Val Thr Leu Leu Lys Tyr Gly Val Tyr Glu Ala Ile Phe Ala Met
305 310 315 320

Leu Ser Ser Val Met Asn Lys Asp Gly Met Leu Val Ala Tyr Gly Asn
325 330 335

Gly Phe Ile Thr Arg Glu Phe Leu Lys Ser Leu Arg Lys Pro Phe Cys
340 345 350

Asp Ile Met Glu Pro Lys Phe Asp Phe Ala Met Lys Phe Asn Ala Leu
355 360 365

Glu Leu Asp Asp Ser Asp Ile Ser Leu Phe Val Ala Ala Ile Ile Cys
370 375 380

Cys Gly Asp Arg Pro Gly Leu Leu Asn Val Gly His Ile Glu Lys Met
385 390 395 400

Gln Glu Gly Ile Val His Val Leu Arg Leu His Leu Gln Ser Asn His
405 410 415

Pro Asp Asp Ile Phe Leu Phe Pro Lys Leu Leu Gln Lys Met Ala Asp
420 425 430

Leu Arg Gln Leu Val Thr Glu His Ala Gln Leu Val Gln Ile Ile Lys
435 440 445

Lys Thr Glu Ser Asp Ala Ala Leu His Pro Leu Leu Gln Glu Ile Tyr
450 455 460

Arg Asp Met Tyr
465 



<210> 28 

<211> 505 

<212> PRT 

<213> HUMAN 

<400> 28 

Met Gly Glu Thr Leu Gly Asp Ser Pro Ile Asp Pro Glu Ser Asp Ser
1 5 10 15

Phe Thr Asp Thr Leu Ser Ala Asn Ile Ser Gln Glu Met Thr Met Val
20 25 30

Thr Glu Met Pro Phe Trp Pro Thr Asn Phe Gly Ile Ser Ser Val
35 40 45

Asp Leu Ser Val Met Glu Asp His Ser His Ser Phe Asp Ile Lys Pro
50 55 60

Phe Thr Thr Val Asp Phe Ser Ser Ile Ser Thr Pro His Tyr Glu Asp
65 70 75 80

Ile Pro Phe Thr Arg Thr Asp Pro Val Val Ala Asp Tyr Lys Tyr Asp
85 90 95

Leu Lys Leu Gln Glu Tyr Gln Ser Ala Ile Lys Val Glu Pro Ala Ser
100 105 110

Pro Pro Tyr Tyr Ser Glu Lys Thr Gln Leu Tyr Asn Lys Pro His Glu
115 120 125

Glu Pro Ser Asn Ser Leu Met Ala Ile Glu Cys Arg Val Cys Gly Asp
130 135 140

Lys Ala Ser Gly Phe His Tyr Gly Val His Ala Cys Glu Gly Cys Lys
145 150 155 160 



Gly Phe Phe Arg Arg Thr Ile Arg Leu Lys Leu Ile Tyr Asp Arg Cys
165 170 175

Asp Leu Asn Cys Arg Ile His Lys Lys Ser Arg Asn Lys Cys Gln Tyr
180 185 190

Cys Arg Phe Gln Lys Cys Leu Ala Val Gly Met Ser His Asn Ala Ile
195 200 205

Arg Phe Gly Arg Met Pro Gln Ala Glu Lys Glu Lys Leu Leu Ala Glu
210 215 220

Ile Ser Ser Asp Ile Asp Gln Leu Asn Pro Glu Ser Ala Asp Leu Arg
225 230 235 240

Ala Leu Ala Lys His Leu Tyr Asp Ser Tyr Ile Lys Ser Phe Pro Leu
245 250 255

Thr Lys Ala Lys Ala Arg Ala Ile Leu Thr Gly Lys Thr Thr Asp Lys
260 265 270

Ser Pro Phe Val Ile Tyr Asp Met Asn Ser Leu Met Met Gly Glu Asp
275 280 285

Lys Ile Lys Phe Lys His Ile Thr Pro Leu Gln Glu Gln Ser Lys Glu
290 295 300

Val Ala Ile Arg Ile Phe Gln Gly Cys Gln Phe Arg Ser Val Glu Ala
305 310 315 320

Val Gln Glu Ile Thr Glu Tyr Ala Lys Ser Ile Pro Gly Phe Val Asn
325 330 335

Leu Asp Leu Asn Asp Gln Val Thr Leu Leu Lys Tyr Gly Val His Glu
340 345 350

Ile Ile Tyr Thr Met Leu Ala Ser Leu Met Asn Lys Asp Gly Val Leu
355 360 365

Ile Ser Glu Gly Gln Gly Phe Met Thr Arg Glu Phe Leu Lys Ser Leu
370 375 380

Arg Lys Pro Phe Gly Asp Phe Met Glu Pro Lys Phe Glu Phe Ala Val
385 390 395 400

Lys Phe Asn Ala Leu Glu Leu Asp Asp Ser Asp Leu Ala Ile Phe Ile
405 410 415

Ala Val Ile Ile Leu Ser Gly Asp Arg Pro Gly Leu Leu Asn Val Lys
420 425 430

Pro Ile Glu Asp Ile Gln Asp Asn Leu Leu Gln Ala Leu Glu Leu Gln
435 440 445

Leu Lys Leu Asn His Pro Glu Ser Ser Gln Leu Phe Ala Lys Leu Leu
450 455 460

Gln Lys Met Thr Asp Leu Arg Gln Ile Val Thr Glu His Val Gln Leu
465 470 475 480

Leu Gln Val Ile Lys Lys Thr Glu Thr Asp Met Ser Leu His Pro Leu
485 490 495

Leu Gln Glu Ile Tyr Lys Asp Leu Tyr
500 505 

<210> 29 
<211> 441 
<212> PRT 
<213> HUMAN 

<400> 29 

Met Glu Gln Pro Gln Glu Glu Ala Pro Glu Val Arg Glu Glu Glu Glu
1 5 10 15

Lys Glu Glu Val Ala Glu Ala Glu Gly Ala Pro Glu Leu Asn Gly Gly
20 25 30

Pro Gln His Ala Leu Pro Ser Ser Ser Tyr Thr Asp Leu Ser Arg Ser
35 40 45

Ser Ser Pro Pro Ser Leu Leu Asp Gln Leu Gln Met Gly Cys Asp Gly
50 55 60

Ala Ser Cys Gly Ser Leu Asn Met Glu Cys Arg Val Cys Gly Asp Lys
65 70 75 80

Ala Ser Gly Phe His Tyr Gly Val His Ala Cys Glu Gly Cys Lys Gly
85 90 95

Phe Phe Arg Arg Thr Ile Arg Met Lys Leu Glu Tyr Glu Lys Cys Glu
100 105 110

Arg Ser Cys Lys Ile Gln Lys Lys Asn Arg Asn Lys Cys Gln Tyr Cys
115 120 125

Arg Phe Gln Lys Cys Leu Ala Leu Gly Met Ser His Asn Ala Ile Arg
130 135 140

Phe Gly Arg Met Pro Glu Ala Glu Lys Arg Lys Leu Val Ala Gly Leu
145 150 155 160

Thr Ala Asn Glu Gly Ser Gln Tyr Asn Pro Gln Val Ala Asp Leu Lys
165 170 175

Ala Phe Ser Lys His Ile Tyr Asn Ala Tyr Leu Lys Asn Phe Asn Met
180 185 190

Thr Lys Lys Lys Ala Arg Ser Ile Leu Thr Gly Lys Ala Ser His Thr
195 200 205

Ala Pro Phe Val Ile His Asp Ile Glu Thr Leu Trp Gln Ala Glu Lys
210 215 220

Gly Leu Val Trp Lys Gln Leu Val Asn Gly Leu Pro Pro Tyr Lys Glu
225 230 235 240

Ile Ser Val His Val Phe Tyr Arg Cys Gln Cys Thr Thr Val Glu Thr
245 250 255

Val Arg Glu Leu Thr Glu Phe Ala Lys Ser Ile Pro Ser Phe Ser Ser
260 265 270

Leu Phe Leu Asn Asp Gln Val Thr Leu Leu Lys Tyr Gly Val His Glu
275 280 285

Ala Ile Phe Ala Met Leu Ala Ser Ile Val Asn Lys Asp Gly Leu Leu
290 295 300

Val Ala Asn Gly Ser Gly Phe Val Thr Arg Glu Phe Leu Arg Ser Leu
305 310 315 320

Arg Lys Pro Phe Ser Asp Ile Ile Glu Pro Lys Phe Glu Phe Ala Val
325 330 335

Lys Phe Asn Ala Leu Glu Leu Asp Asp Ser Asp Leu Ala Leu Phe Ile
340 345 350

Ala Ala Ile Ile Leu Cys Gly Asp Arg Pro Gly Leu Met Asn Val Pro
355 360 365

Arg Val Glu Ala Ile Gln Asp Thr Ile Leu Arg Ala Leu Glu Phe His
370 375 380

Leu Gln Ala Asn His Pro Asp Ala Gln Tyr Leu Phe Pro Lys Leu Leu
385 390 395 400

Gln Lys Met Ala Asp Leu Arg Gln Leu Val Thr Glu His Ala Gln Met
405 410 415

Met Gln Arg Ile Lys Lys Thr Glu Thr Glu Thr Ser Leu His Pro Leu
420 425 430

Leu Gln Glu Ile Tyr Lys Asp Met Tyr
435 440 

<210> 30 

<211> 742 

<212> PRT 

<213> HUMAN 

<400> 30 

Met Asp Lys Phe Trp Trp His Ala Ala Trp Gly Leu Cys Leu Val Pro 
1 5 10 15

Leu Ser Leu Ala Gln Ile Asp Leu Asn Ile Thr Cys Arg Phe Ala Gly
20 25 30

Val Phe His Val Glu Lys Asn Gly Arg Tyr Ser Ile Ser Arg Thr Glu
35 40 45

Ala Ala Asp Leu Cys Lys Ala Phe Asn Ser Thr Leu Pro Thr Met Ala
50 55 60

Gln Met Glu Lys Ala Leu Ser Ile Gly Phe Glu Thr Cys Arg Tyr Gly
65 70 75 80

Phe Ile Glu Gly His Val Val Ile Pro Arg Ile His Pro Asn Ser Ile
85 90 95

Cys Ala Ala Asn Asn Thr Gly Val Tyr Ile Leu Thr Ser Asn Thr Ser
100 105 110

Gln Tyr Asp Thr Tyr Cys Phe Asn Ala Ser Ala Pro Pro Glu Glu Asp
115 120 125

Cys Thr Ser Val Thr Asp Leu Pro Asn Ala Phe Asp Gly Pro Ile Thr
130 135 140

Ile Thr Ile Val Asn Arg Asp Gly Thr Arg Tyr Val Gln Lys Gly Glu
145 150 155 160

Tyr Arg Thr Asn Pro Glu Asp Ile Tyr Pro Ser Asn Pro Thr Asp Asp
165 170 175

Asp Val Ser Ser Gly Ser Ser Ser Glu Arg Ser Ser Thr Ser Gly Gly
180 185 190

Tyr Ile Phe Tyr Thr Phe Ser Thr Val His Pro Ile Pro Asp Glu Asp
195 200 205

Ser Pro Trp Ile Thr Asp Ser Thr Asp Arg Ile Pro Ala Thr Thr Leu
210 215 220

Met Ser Thr Ser Ala Thr Ala Thr Glu Thr Ala Thr Lys Arg Gln Glu
225 230 235 240

Thr Trp Asp Trp Phe Ser Trp Leu Phe Leu Pro Ser Glu Ser Lys Asn
245 250 255

His Leu His Thr Thr Thr Gln Met Ala Gly Thr Ser Ser Asn Thr Ile
260 265 270

Ser Ala Gly Trp Glu Pro Asn Glu Glu Asn Glu Asp Glu Arg Asp Arg
275 280 285

His Leu Ser Phe Ser Gly Ser Gly Ile Asp Asp Asp Glu Asp Phe Ile
290 295 300

Ser Ser Thr Ile Ser Thr Thr Pro Arg Ala Phe Asp His Thr Lys Gln
305 310 315 320

Asn Gln Asp Trp Thr Gln Trp Asn Pro Ser His Ser Asn Pro Glu Val
325 330 335

Leu Leu Gln Thr Thr Thr Arg Met Thr Asp Val Asp Arg Asn Gly Thr
340 345 350

Thr Ala Tyr Glu Gly Asn Trp Asn Pro Glu Ala His Pro Pro Leu Ile
355 360 365

His His Glu His His Glu Glu Glu Glu Thr Pro His Ser Thr Ser Thr
370 375 380

Ile Gln Ala Thr Pro Ser Ser Thr Thr Glu Glu Thr Ala Thr Gln Lys
385 390 395 400

Glu Gln Trp Phe Gly Asn Arg Trp His Glu Gly Tyr Arg Gln Thr Pro
405 410 415

Lys Glu Asp Ser His Ser Thr Thr Gly Thr Ala Ala Ala Ser Ala His
420 425 430

Thr Ser His Pro Met Gln Gly Arg Thr Thr Pro Ser Pro Glu Asp Ser
435 440 445 

Ser Trp Thr Asp Phe Phe Asn Pro Ile Ser His Pro Met Gly Arg Gly
450 455 460

His Gln Ala Gly Arg Arg Met Asp Met Asp Ser Ser His Ser Ile Thr
465 470 475 480

Leu Gln Pro Thr Ala Asn Pro Asn Thr Gly Leu Val Glu Asp Leu Asp
485 490 495

Arg Thr Gly Pro Leu Ser Met Thr Thr Gln Gln Ser Asn Ser Gln Ser
500 505 510

Phe Ser Thr Ser His Glu Gly Leu Glu Glu Asp Lys Asp His Pro Thr
515 520 525

Thr Ser Thr Leu Thr Ser Ser Asn Arg Asn Asp Val Thr Gly Gly Arg
530 535 540

Arg Asp Pro Asn His Ser Glu Gly Ser Thr Thr Leu Leu Glu Gly Tyr
545 550 555 560

Thr Ser His Tyr Pro His Thr Lys Glu Ser Arg Thr Phe Ile Pro Val
565 570 575

Thr Ser Ala Lys Thr Gly Ser Phe Gly Val Thr Ala Val Thr Val Gly
580 585 590

Asp Ser Asn Ser Asn Val Asn Arg Ser Leu Ser Gly Asp Gln Asp Thr
595 600 605

Phe His Pro Ser Gly Gly Ser His Thr Thr His Gly Ser Glu Ser Asp
610 615 620

Gly His Ser His Gly Ser Gln Glu Gly Gly Ala Asn Thr Thr Ser Gly
625 630 635 640

Pro Ile Arg Thr Pro Gln Ile Pro Glu Trp Leu Ile Ile Leu Ala Ser
645 650 655

Leu Leu Ala Leu Ala Leu Ile Leu Ala Val Cys Ile Ala Val Asn Ser
660 665 670

Arg Arg Arg Cys Gly Gln Lys Lys Lys Leu Val Ile Asn Ser Gly Asn
675 680 685

Gly Ala Val Glu Asp Arg Lys Pro Ser Gly Leu Asn Gly Glu Ala Ser
690 695 700

Lys Ser Gln Glu Met Val His Leu Val Asn Lys Glu Ser Ser Glu Thr
705 710 715 720

Pro Asp Gln Phe Met Thr Ala Asp Glu Thr Arg Asn Leu Gln Asn Val
725 730 735

Asp Met Lys Ile Gly Val
740 

<210> 31 
<211> 489 
<212> PRT 
<213> HUMAN 

<400> 31 

Met Leu Met Arg Leu Val Leu Thr Val Arg Ser Asn Leu Ile Pro Ser
1 5 10 15

Pro Pro Thr Tyr Asn Ser Ala His Asp Tyr Ile Ser Trp Glu Ser Phe
20 25 30

Ser Asn Val Ser Tyr Tyr Thr Arg Ile Leu Pro Ser Val Pro Lys Asp
35 40 45

Cys Pro Thr Pro Met Gly Thr Lys Gly Lys Lys Gln Leu Pro Asp Ala
50 55 60

Gln Leu Leu Ala Arg Arg Phe Leu Leu Arg Arg Lys Phe Ile Pro Asp
65 70 75 80

Pro Gln Gly Thr Asn Leu Met Phe Ala Phe Phe Ala Gln His Phe Thr
85 90 95

His Gln Phe Phe Lys Thr Ser Gly Lys Met Gly Pro Gly Phe Thr Lys
100 105 110

Ala Leu Gly His Gly Val Asp Leu Gly His Ile Tyr Gly Asp Asn Leu
115 120 125

Glu Arg Gln Tyr Gln Leu Arg Leu Phe Lys Asp Gly Lys Leu Lys Tyr
130 135 140

Gln Val Leu Asp Gly Glu Met Tyr Pro Pro Ser Val Glu Glu Ala Pro
145 150 155 160

Val Leu Met His Tyr Pro Arg Gly Ile Pro Pro Gln Ser Gln Met Ala
165 170 175

Val Gly Gln Glu Val Phe Gly Leu Leu Pro Gly Leu Met Leu Tyr Ala
180 185 190

Thr Leu Trp Leu Arg Glu His Asn Arg Val Cys Asp Leu Leu Lys Ala
195 200 205

Glu His Pro Thr Trp Gly Asp Glu Gln Leu Phe Gln Thr Thr Arg Leu
210 215 220

Ile Leu Ile Gly Glu Thr Ile Lys Ile Val Ile Glu Glu Tyr Val Gln
225 230 235 240

Gln Leu Ser Gly Tyr Phe Leu Gln Leu Lys Phe Asp Pro Glu Leu Leu
245 250 255

Phe Gly Val Gln Phe Gln Tyr Arg Asn Arg Ile Ala Met Glu Phe Asn
260 265 270

His Leu Tyr His Trp His Pro Leu Met Pro Asp Ser Phe Lys Val Gly
275 280 285

Ser Gln Glu Tyr Ser Tyr Glu Gln Phe Leu Phe Asn Thr Ser Met Leu
290 295 300

Val Asp Tyr Gly Val Glu Ala Leu Val Asp Ala Phe Ser Arg Gln Ile
305 310 315 320

Ala Gly Arg Ile Gly Gly Gly Arg Asn Met Asp His His Ile Leu His
325 330 335

Val Ala Val Asp Val Ile Arg Glu Ser Arg Glu Met Arg Leu Gln Pro
340 345 350

Phe Asn Glu Tyr Arg Lys Arg Phe Gly Met Lys Pro Tyr Thr Ser Phe
355 360 365

Gln Glu Leu Val Gly Glu Lys Glu Met Ala Ala Glu Leu Glu Glu Leu
370 375 380

Tyr Gly Asp Ile Asp Ala Leu Glu Phe Tyr Pro Gly Leu Leu Leu Glu
385 390 395 400 

Lys Cys His Pro Asn Ser Ile Phe Gly Glu Ser Met Ile Glu Ile Gly
405 410 415

Ala Pro Phe Ser Leu Lys Gly Leu Leu Gly Asn Pro Ile Cys Ser Pro
420 425 430

Glu Tyr Trp Lys Pro Ser Thr Phe Gly Gly Glu Val Gly Phe Asn Ile
435 440 445

Val Lys Thr Ala Thr Leu Lys Lys Leu Val Cys Leu Asn Thr Lys Thr
450 455 460

Cys Pro Tyr Val Ser Phe Arg Val Pro Asp Ala Ser Gln Asp Asp Gly
465 470 475 480

Pro Ala Val Glu Arg Pro Ser Thr Glu
485 

<210> 32 
<211> 122 
<212> PRT 
<213> HUMAN 

<400> 32 

Met Lys Leu Leu Thr Gly Leu Val Phe Cys Ser Leu Val Leu Gly Val
1 5 10 15

Ser Ser Arg Ser Phe Phe Ser Phe Leu Gly Glu Ala Phe Asp Gly Ala
20 25 30

Arg Asp Met Trp Arg Ala Tyr Ser Asp Met Arg Glu Ala Asn Tyr Ile
35 40 45

Gly Ser Asp Lys Tyr Phe His Ala Arg Gly Asn Tyr Asp Ala Ala Lys
50 55 60

Arg Gly Pro Gly Gly Val Trp Ala Ala Glu Ala Ile Ser Asp Ala Arg
65 70 75 80

Glu Asn Ile Gln Arg Phe Phe Gly His Gly Ala Glu Asp Ser Leu Ala
85 90 95

Asp Gln Ala Ala Asn Glu Trp Gly Arg Ser Gly Lys Asp Pro Asn His
100 105 110 

Phe Arg Pro Ala Gly Leu Pro Glu Lys Tyr 
115 120

<210> 33
<211> 26 
<212> DNA 
<213> HUMAN 

<400> 33 

agatattgca cgggagaata tacaaa 26 

<210> 34 

<211> 27 

<212> DNA 

<213> HUMAN 

<400> 34 

tcaattcctg aaattaaagt tcggata 27 

<210> 35 

<211> 23 

<212> DNA 

<213> HUMAN 

<400> 35 

tctgcagagt tggaagcact cta 23

<210> 36 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 36 

gccgaggctt ttctaccaga a 21

<210> 37 

<211> 20 

<212> DNA 

<213> HUMAN 

<400> 37 

catggcttga tcagcaagga 20 

<210> 38 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 38 

tggaagtgtg ccctgaagaa g 21 

<210> 39 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 39 

aagcagcacc agcaagtgaa g 21

<210> 40

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 40 

tcatggcctg tgtcagtcaa a 21

<210> 41 

<211> 22 

<212> DNA 

<213> HUMAN 

<400> 41 

acatgccagc cactgtgata ga 22

<210> 42 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 42 

ccctgccttc acaatgatct c 21

<210> 43 

<211> 23 

<212> DNA 

<213> HUMAN 

<400> 43 

ggaattcacc tcaagaacat cca 23

<210> 44 

<211> 23 

<212> DNA 

<213> HUMAN 

<400> 44 

agtgtggcta tgacttcggt ttg 23

<210> 45 

<211> 22 

<212> DNA 

<213> HUMAN 

<400> 45 

cagccacaag cagtccagat ta 22

<210> 46 

<211> 24 

<212> DNA 

<213> HUMAN 

<400> 46 

cctgactatc aatcacatcg gaat 24

<210> 47

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 47 

ccaggtgctc cacatgacag t 21

<210> 48 

<211> 24 

<212> DNA 

<213> HUMAN 

<400> 48 

aaacaaccaa caacaaggag aatg 24

<210> 49 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 49 

cgtctccaca catcagcaca a 21

<210> 50 

<211> 22 

<212> DNA 

<213> HUMAN 

<400> 50 

tcttggcagc aggatagtcc tt 22

<210> 51 

<211> 22 

<212> DNA 

<213> HUMAN 

<400> 51 

gcagaccagc atgacagatt tc 22

<210> 52 

<211> 20 

<212> DNA 

<213> HUMAN 

<400> 52 

gcggattagg gcttcctctt 20

<210> 53 

<211> 23 

<212> DNA 

<213> HUMAN 

<400> 53 

tgaagttcaa tgcactggaa ctg 23

<210> 54
<211> 20 
<212> DNA 
<213> HUMAN 

<400> 54 

caggacgatc tccacagcaa 20 

<210> 55 

<211> 23 

<212> DNA 

<213> HUMAN 

<400> 55 

tggagtccac gagatcattt aca 23 

<210> 56 

<211> 19 

<212> DNA 

<213> HUMAN 

<400> 56 

agccttggcc ctcggatat 19 

<210> 57 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 57 

cactgagttc gccaagagca t 21 

<210> 58 

<211> 23 

<212> DNA 

<213> HUMAN 

<400> 58 

cacgccatac ttgagaaggg taa 23 

<210> 59 

<211> 23

<212> DNA 

<213> HUMAN 

<400> 59 

gctagtgatc aacagtggca atg 23 

<210> 60 

<211> 18 

<212> DNA 

<213> HUMAN 

<400> 60 

gctggcctct ccgttgag 18

<210> 61
<211> 22 
<212> DNA 
<213> HUMAN 

<400> 61 

tgttcggtgt ccagttccaa ta 22 

<210> 62 

<211> 22 

<212> DNA 

<213> HUMAN 

<400> 62 

tgccagtggt agagatggtt ga 22 

<210> 63 

<211> 22 

<212> DNA 

<213> HUMAN 

<400> 63 

gggacatgtg gagagcctac tc 22 

<210> 64 

<211> 21 

<212> DNA 

<213> HUMAN 

<400> 64 

catcatagtt cccccgagca t 21 






